All Publications and Patents

Please feel free to reach out to me if you need a preprint of a paper not available here.

Theses
Journal Articles
Conference and Workshop Papers
Patents

Theses

Supporting User Steering in Large-scale Workflows with Provenance Data
R. Souza
COPPE/Federal University of Rio de Janeiro, Ph.D. Thesis, 2019.
[T1] [online] [pdf] [bibtex]

Controlling the Parallel Execution of Workflows Relying on a Distributed Database
R. Souza
COPPE/Federal University of Rio de Janeiro, M.Sc. Thesis, 2015.
[T2] [online] [pdf] [bibtex]

Linked Open Data Publication Strategies: An Application in Network Performance Data (in Portuguese)
R. Souza
DCC/Federal University of Rio de Janeiro, B.Sc. Thesis, 2013.
[T3] [pdf] [bibtex]

Journal Articles

Workflows Community Summit 2024: Future Trends and Challenges in Scientific Workflows
D. Bard, K. Chard, S. de Witt, I. Foster, C. Goble, W. Godoy, J. Gustafsson, U. Haus, S. Hudson, L. Los, R. Souza, and others
arXiv preprint Distributed, Parallel, and Cluster Computing (cs.DC), 2024.
[J1] [doi] [online] [pdf] [bibtex]

A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores
L. Azevedo, R. Souza, E. S. Soares, R. Thiago, J. Tesolin, A. Oliveira, and M. Moreno
arXiv preprint Databases (cs.DB), 2023.
[J2] [doi] [online] [pdf] [bibtex]

Workflows Community Summit 2022: A Roadmap Revolution
R. da Silva, R. Badia, V. Bala, D. Bard, P. Bremer, I. Buckley, S. Caino-Lores, K. Chard, C. Goble, S. Jha, ..., R. Souza, et al.
arXiv preprint Distributed, Parallel, and Cluster Computing (cs.DC), 2023.
[J3] [doi] [online] [pdf] [bibtex]

Workflow Provenance in the Lifecycle of Scientific Machine Learning
R. Souza, L. G. Azevedo, V. Lourenço, E. Soares, R. Thiago, R. Brandão, D. Civitarese, E. Vital Brazil, M. Moreno, P. Valduriez, M. Mattoso, R. Cerqueira, and M. A. S. Netto
Concurrency and Computation: Practice and Experience, 2021.
[J4] [abstract] [online] [pdf] [bibtex]

Distributed In-memory Data Management for Workflow Executions
R. Souza, V. Silva, A. Lima, D. Oliveira, P. Valduriez, and M. Mattoso
PeerJ Computer Science, 2021.
[J5] [abstract] [doi] [online] [pdf] [bibtex]

@article{souza_distributed_2021,
  abstract = {Complex scientific experiments from various domains are typically modeled as workflows and executed on large-scale machines using a Parallel Workflow Management System (WMS). Since such executions usually last for hours or days, some WMSs provide user steering support, i.e., they allow users to run data analyses and, depending on the results, adapt the workflows at runtime. A challenge in the parallel execution control design is to manage workflow data for efficient executions while enabling user steering support. Data access for high scalability is typically transaction-oriented, while for data analysis, it is online analytical-oriented so that managing such hybrid workloads makes the challenge even harder. In this work, we present SchalaDB, an architecture with a set of design principles and techniques based on distributed in-memory data management for efficient workflow execution control and user steering. We propose a distributed data design for scalable workflow task scheduling and high availability driven by a parallel and distributed in-memory DBMS. To evaluate our proposal, we develop d-Chiron, a WMS designed according to SchalaDB's principles. We carry out an extensive experimental evaluation on an HPC cluster with up to 960 computing cores. Among other analyses, we show that even when running data analyses for user steering, SchalaDB's overhead is negligible for workloads composed of hundreds of concurrent tasks on shared data. Our results encourage workflow engine developers to follow a parallel and distributed data-oriented approach not only for scheduling and monitoring but also for user steering.},
  author = {Souza, R. and Silva, V. and Lima, A. A. B. and Oliveira, D. and Valduriez, P. and Mattoso, M.},
  doi = {10.7717/peerj-cs.527},
  journal = {PeerJ Computer Science},
  link = {https://peerj.com/articles/cs-527/},
  pages = {1--30},
  pdf = {https://arxiv.org/ftp/arxiv/papers/2105/2105.04720.pdf},
  title = {Distributed In-memory Data Management for Workflow Executions},
  volume = {7},
  year = {2021}
}

Workflows Community Summit: Advancing the State-of-the-art of Scientific Workflows Management Systems Research and Development
R. da Silva, H. Casanova, K. Chard, ..., R. Souza, et al.
arXiv preprint Distributed, Parallel, and Cluster Computing (cs.DC), 2021.
[J6] [online] [pdf] [bibtex]

Adding Hyperknowledge-enabled data lineage to a machine learning workflow management system for oil and gas
L. Azevedo, R. Souza, R. Brandão, V. Lourenço, M. Costalonga, M. de Machado, M. Moreno, and R. Cerqueira
First Break, 2020.
[J7] [doi] [bibtex]

Keeping Track of User Steering Actions in Dynamic Workflows
R. Souza, V. Silva, J. Camata, A. Coutinho, P. Valduriez, and M. Mattoso
Future Generation Computer Systems, 2019.
[J8] [abstract] [doi] [online] [pdf] [bibtex]

@article{souza_keeping_2019,
  abstract = {In long-lasting scientific workflow executions in HPC machines, computational scientists (the users in this work) often need to fine-tune several workflow parameters. These tunings are done through user steering actions that may significantly improve performance (e.g., reduce execution time) or improve the overall results. However, in executions that last for weeks, users can lose track of what has been adapted if the tunings are not properly registered. In this work, we build on provenance data management to address the problem of tracking online parameter fine-tuning in dynamic workflows steered by users. We propose a lightweight solution to capture and manage provenance of the steering actions online with negligible overhead. The resulting provenance database relates tuning data with data for domain, dataflow provenance, execution, and performance, and is available for analysis at runtime. We show how users may get a detailed view of the execution, providing insights to determine when and how to tune. We discuss the applicability of our solution in different domains and validate its ability to allow for online capture and analyses of parameter fine-tunings in a real workflow in the Oil and Gas industry. In this experiment, the user could determine which tuned parameters influenced simulation accuracy and performance. The observed overhead for keeping track of user steering actions at runtime is less than 1\% of total execution time.},
  author = {Souza, Renan and Silva, Vítor and Camata, Jose J. and Coutinho, Alvaro L. G. A. and Valduriez, Patrick and Mattoso, Marta},
  doi = {10.1016/j.future.2019.05.011},
  issn = {0167-739X},
  journal = {Future Generation Computer Systems},
  keyword = {Dynamic workflows, Computational steering, Provenance data, Parameter tuning},
  link = {https://doi.org/10.1016/j.future.2019.05.011},
  pages = {624--643},
  pdf = {https://hal-lirmm.ccsd.cnrs.fr/lirmm-02127456/document},
  title = {Keeping Track of User Steering Actions in Dynamic Workflows},
  volume = {99},
  year = {2019}
}

Adding Domain Data to Code Profiling Tools to Debug Workflow Parallel Execution
V. Silva, L. Neves, R. Souza, A. Coutinho, D. de Oliveira, and M. Mattoso
Future Generation Computer Systems, 2018.
[J9] [doi] [bibtex]

Data Reduction in Scientific Workflows Using Provenance Monitoring and User Steering
R. Souza, V. Silva, A. Coutinho, P. Valduriez, and M. Mattoso
Future Generation Computer Systems, 2017.
[J10] [abstract] [doi] [pdf] [bibtex]

@article{Souza2017Data,
  abstract = {Scientific workflows need to be iteratively, and often interactively, executed for large input datasets. Reducing data from input datasets is a powerful way to reduce overall execution time in such workflows. When this is accomplished online (i.e., without requiring the user to stop execution to reduce the data, and then resume), it can save much time. However, determining which subsets of the input data should be removed becomes a major problem. A related problem is to guarantee that the workflow system will maintain execution and data consistent with the reduction. Keeping track of how users interact with the workflow is essential for data provenance purposes. In this paper, we adopt the “human-in-the-loop” approach, which enables users to steer the running workflow and reduce subsets from datasets online. We propose an adaptive workflow monitoring approach that combines provenance data monitoring and computational steering to support users in analyzing the evolution of key parameters and determining the subset of data to remove. We extend a provenance data model to keep track of users’ interactions when they reduce data at runtime. In our experimental validation, we develop a test case from the oil and gas domain, using a 936-cores cluster. The results on this test case show that the approach yields reductions of 32\% of execution time and 14\% of the data processed.},
  author = {Souza, Renan and Silva, Vítor and Coutinho, Alvaro L. G. A. and Valduriez, Patrick and Mattoso, Marta},
  doi = {10.1016/j.future.2017.11.028},
  issn = {0167-739X},
  journal = {Future Generation Computer Systems},
  keyword = {Scientific Workflows, Human in the Loop, Online Data Reduction, Provenance Data, Dynamic Workflows},
  pages = {481--501},
  pdf = {https://hal-lirmm.ccsd.cnrs.fr/lirmm-01679967/document},
  title = {Data Reduction in Scientific Workflows Using Provenance Monitoring and User Steering},
  volume = {110},
  year = {2017}
}

A Hybrid Architecture for Multi-party Conversational Systems
M. de Bayser, P. Cavalin, R. Souza, A. Braz, H. Candello, C. Pinhanez, and J. Briot
arXiv preprint Computation and Language (cs.CL), 2017.
[J11] [online] [pdf] [bibtex]

Conference and Workshop Papers

Workflow Provenance in the Computing Continuum for Responsible, Trustworthy, and Energy-Efficient AI
R. Souza, S. Caino-Lores, M. Coletti, T. Skluzacek, A. Costan, F. Suter, M. Mattoso, and R. Silva
IEEE International Conference on e-Science, 2024.
[C1] [abstract] [pdf] [bibtex]

Integrating Evolutionary Algorithms with Distributed Deep Learning for Optimizing Hyperparameters on HPC System
M. Coletti, R. Souza, T. Skluzacek, F. Suter, and R. Silva
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2024.
[C2] [bibtex]

Eco-Driven AI-HPC: Optimizing Energy Efficiency in Distributed Scientific Workflows
R. Silva, W. Shin, F. Suter, A. Gainaru, R. Souza, D. Dietz, and S. Jha
Energy-Efficient Computing for Science Workshop, 2024.
[C3] [bibtex]

Towards Cross-Facility Workflows Orchestration through Distributed Automation
T. Skluzacek, R. Souza, M. Coletti, F. Suter, and R. Silva
Practice and Experience in Advanced Research Computing (PEARC 24), 2024.
[C4] [doi] [online] [bibtex]

Advancing Computational Earth Sciences: Innovations and Challenges in Scientific HPC Workflows
R. da Silva, K. Maheshwari, T. Skluzacek, R. Souza, and S. Wilkinson
European Geosciences Union (EGU), 2024.
[C5] [bibtex]

HKPoly: A Polystore Architecture to Support Data Linkage and Queries on Distributed and Heterogeneous Data
L. Azevedo, R. Souza, E. Soares, R. Thiago, J. Tesolin, A. Oliveira, and M. Moreno
Proceedings of the 20th Brazilian Symposium on Information Systems (SBSI), 2024.
[C6] [doi] [online] [bibtex]

Towards Lightweight Data Integration using Multi-workflow Provenance and Data Observability
R. Souza, T. Skluzacek, S. Wilkinson, M. Ziatdinov, and R. da Silva
IEEE International Conference on e-Science, 2023.
[C7] [abstract] [doi] [online] [pdf] [bibtex]

ProvLight: Efficient Workflow Provenance Capture on the Edge-to-Cloud Continuum
D. Rosendo, M. Mattoso, A. Costan, R. Souza, D. Pina, P. Valduriez, and G. Antoniu
IEEE International Conference on Cluster Computing, 2023.
[C8] [doi] [online] [pdf] [bibtex]

Context-aware Execution Migration Tool for Data Science Jupyter Notebooks on Hybrid Clouds
R. Cunha, L. Real, R. Souza, B. Silva, and M. Netto
IEEE International Conference on e-Science, 2021.
[C9] [doi] [pdf] [bibtex]

Supporting Polystore Queries using Provenance in a Hyperknowledge Graph
L. Azevedo, R. Souza, E. Soares, R. Thiago, A. Oliveira, and M. Moreno
International Semantic Web Conference (ISWC), 2021.
[C10] [pdf] [bibtex]

User Steering Support in Large-scale Workflows
R. Souza
PhD Thesis Contest: Brazilian Symposium on Databases (SBBD), 2021.
[C11] [pdf] [bibtex]

A Recommender for Choosing Data Systems based on Application Profiling and Benchmarking
E. Soares, R. Souza, R. Thiago, M. Machado, and L. Azevedo
Brazilian Symposium on Databases (SBBD), 2021.
[C12] [bibtex]

Cycle Orchestrator: A Knowledge-Based Approach for Structuring Cyclic ML Pipelines in the O&G Industry
R. Brandão, V. Lourenço, M. Machado, L. Azevedo, M. Cardoso, R. Souza, G. Lima, R. Cerqueira, and M. Moreno
International Semantic Web Conference (ISWC), 2020.
[C13] [bibtex]

A Knowledge-Based Approach for Structuring Cyclic Workflows
R. Brandão, V. Lourenço, M. Machado, L. Azevedo, M. Cardoso, R. Souza, G. Lima, R. Cerqueira, and M. Moreno
International Semantic Web Conference (ISWC), 2020.
[C14] [bibtex]

Runtime Steering of Parallel CFD Simulations
R. Souza, J. Camata, M. Mattoso, and A. Coutinho
International Conference on Parallel Computational Fluid Dynamics, 2020.
[C15] [bibtex]

Experiencing ProvLake to Manage the Data Lineage of AI Workflows
L. Azevedo, R. Souza, R. Thiago, E. Soares, and M. Moreno
Innovation Summit on Information Systems (EISI) at the Brazilian Symposium on Information Systems (SBSI), 2020.
[C16] [bibtex]

Modern Federated Databases: an Overview
L. Azevedo, R. Souza, E. Soares, and M. Moreno
International Conference on Enterprise Information Systems (ICEIS), 2020.
[C17] [bibtex]

Supporting the Training of Physics Informed Neural Networks for Seismic Inversion Using Provenance
R. Souza, A. Codas, J. Nogueira Junior, M. Quinones, L. Azevedo, R. Thiago, E. Soares, M. Cardoso, and L. Martins
American Association of Petroleum Geologists Annual Convention and Exhibition (AAPG), 2020.
[C18] [bibtex]

Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case
R. Thiago, R. Souza, L. Azevedo, E. Soares, R. Santos, W. Santos, M. De Bayser, M. Cardoso, M. Moreno, and R. Cerqueira
European Association of Geoscientists and Engineers (EAGE) Digitalization Conference and Exhibition, 2020.
[C19] [doi] [pdf] [bibtex]

Efficient Runtime Capture of Multiworkflow Data Using Provenance
R. Souza, L. Azevedo, R. Thiago, E. Soares, M. Nery, M. Netto, E. Brazil, R. Cerqueira, P. Valduriez, and M. Mattoso
IEEE International Conference on e-Science, 2019.
[C20] [abstract] [doi] [online] [pdf] [bibtex]

@inproceedings{souza_efficient_2019,
  abstract = {Computational Science and Engineering (CSE) projects are typically developed by multidisciplinary teams. Despite being part of the same project, each team manages its own workflows, using specific execution environments and data processing tools. Analyzing the data processed by all workflows globally is a core task in a CSE project. However, this analysis is hard because the data generated by these workflows are not integrated. In addition, since these workflows may take a long time to execute, data analysis needs to be done at runtime to reduce cost and time of the CSE project. A typical solution in scientific data analysis is to capture and relate the data in a provenance database while the workflows run, thus allowing for data analysis at runtime. However, the main problem is that such data capture competes with the running workflows, adding significant overhead to their execution. To mitigate this problem, we introduce in this paper a system called ProvLake, which adopts design principles for providing efficient distributed data capture from the workflows. While capturing the data, ProvLake logically integrates and ingests them into a provenance database ready for analyses at runtime. We validated ProvLake in a real use case in the O&G industry encompassing four workflows that process 5TB datasets for a deep learning classifier. Compared with Komadu, the closest solution that meets our goals, our approach enables runtime multiworkflow data analysis with much smaller overhead, such as 0.1\%.},
  author = {Souza, Renan and Azevedo, Leonardo and Thiago, Raphael and Soares, Elton and Nery, Marcelo and Netto, Marco and Brazil, Emilio Vital and Cerqueira, Renato and Valduriez, Patrick and Mattoso, Marta},
  booktitle = {IEEE International Conference on e-Science},
  doi = {10.1109/eScience.2019.00047},
  keyword = {Multiworkflow provenance, Multi-Data Lineage, Data Lake Provenance, ProvLake},
  link = {https://doi.org/10.1109/eScience.2019.00047},
  pages = {1--10},
  pdf = {https://hal-lirmm.ccsd.cnrs.fr/lirmm-02265932/document},
  title = {Efficient Runtime Capture of Multiworkflow Data Using Provenance},
  year = {2019}
}

Managing Data Traceability in the Data Lifecycle for Deep Learning Applied to Seismic Data
R. Souza, E. Brazil, L. Azevedo, R. Ferreira, E. Soares, R. Thiago, M. Nery, V. Torres, and R. Cerqueira
American Association of Petroleum Geologists Annual Convention and Exhibition (AAPG), 2019.
[C21] [online] [bibtex]

Provenance Data in the Machine Learning Lifecycle in Computational Science and Engineering
R. Souza, L. Azevedo, V. Lourenço, E. Soares, R. Thiago, R. Brandão, D. Civitarese, E. Vital Brazil, M. Moreno, P. Valduriez, M. Mattoso, R. Cerqueira, and M. A. S. Netto
Workflows in Support of Large-Scale Science (WORKS) co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2019.
[C22] [abstract] [doi] [pdf] [bibtex]

@inproceedings{souza_provenancedata_2019,
  abstract = {Machine Learning (ML) has become essential in several industries. In Computational Science and Engineering (CSE), the complexity of the ML lifecycle comes from the large variety of data, scientists' expertise, tools, and workflows. If data are not tracked properly during the lifecycle, it becomes unfeasible to recreate a ML model from scratch or to explain to stakeholders how it was created. The main limitation of provenance tracking solutions is that they cannot cope with provenance capture and integration of domain and ML data processed in the multiple workflows in the lifecycle while keeping the provenance capture overhead low. To handle this problem, in this paper we contribute with a detailed characterization of provenance data in the ML lifecycle in CSE; a new provenance data representation, called PROV-ML, built on top of W3C PROV and ML Schema; and extensions to a system that tracks provenance from multiple workflows to address the characteristics of ML and CSE, and to allow for provenance queries with a standard vocabulary. We show a practical use in a real case in the Oil and Gas industry, along with its evaluation using 48 GPUs in parallel.},
  author = {Souza, Renan and Azevedo, Leonardo and Lourenço, Vítor and Soares, Elton and Thiago, Raphael and Brandão, Rafael and Civitarese, Daniel and Vital Brazil, Emilio and Moreno, Marcio and Valduriez, Patrick and Mattoso, Marta and Cerqueira, Renato and Netto, Marco A. S.},
  booktitle = {Workflows in Support of Large-Scale Science ({WORKS}) co-located with the {ACM}/{IEEE} International Conference for High Performance Computing, Networking, Storage, and Analysis ({SC})},
  doi = {10.1109/WORKS49585.2019.00006},
  keyword = {Machine Learning Lifecycle, Workflow Provenance, Computational Science and Engineering},
  pages = {1--10},
  pdf = {https://arxiv.org/pdf/1910.04223},
  title = {Provenance Data in the Machine Learning Lifecycle in Computational Science and Engineering},
  year = {2019}
}

Towards a human-in-the-loop library for tracking hyperparameter tuning in deep learning development
R. Souza, L. Neves, L. Azeredo, R. Luiz, E. Tady, P. Cavalin, and M. Mattoso
Latin American Data Science (LaDaS) workshop co-located with the Very Large Database (VLDB) conference, 2018.
[C23] [pdf] [bibtex]

Capturing Provenance for Runtime Data Analysis in Computational Science and Engineering Applications
V. Silva, R. Souza, J. Camata, D. de Oliveira, P. Valduriez, A. Coutinho, and M. Mattoso
Provenance and Annotation of Data and Processes - International Provenance and Annotation Workshop (IPAW), 2018.
[C24] [doi] [bibtex]

Provenance of Dynamic Adaptations in User-Steered Dataflows
R. Souza and M. Mattoso
Provenance and Annotation of Data and Processes - International Provenance and Annotation Workshop (IPAW), 2018.
[C25] [doi] [pdf] [bibtex]

Ravel: A MAS orchestration platform for Human-Chatbots Conversations
M. de Bayser, C. Pinhanez, H. Candello, M. Affonso, M. Vasconcelos, M. Guerra, P. Cavalin, and R. Souza
International Workshop on Engineering Multi-Agent Systems (EMAS@AAMAS 2018), 2018.
[C26] [pdf] [bibtex]

Scientific Data Analysis Using Data-Intensive Scalable Computing: the SciDISC Project
P. Valduriez, M. Mattoso, R. Akbarinia, H. Borges, J. Camata, A. Coutinho, D. Gaspar, N. Lemus, J. Liu, H. Lustosa, F. Masseglia, F. Nogueira Da Silva, V. Silva, R. Souza, K. Ocaña, E. Ogasawara, D. Oliveira, E. Pacitti, F. Porto, and D. Shasha
LADaS: Latin America Data Science Workshop, 2018.
[C27] [online] [pdf] [bibtex]

Tracking of online parameter fine-tuning in scientific workflows
R. Souza, V. Silva, J. Camata, A. Coutinho, P. Valduriez, and M. Mattoso
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2017.
[C28] [online] [bibtex]

Spark Scalability Analysis in a Scientific Workflow
R. Souza, V. Silva, P. Miranda, A. Lima, P. Valduriez, and M. Mattoso
Brazilian Symposium on Databases (SBBD), 2017.
[C29] [pdf] [bibtex]

Parallel Execution of Workflows driven by Distributed Database Techniques
R. Souza
MSc Thesis Contest: Brazilian Symposium on Databases (SBBD), 2017.
[C30] [pdf] [bibtex]

Online Input Data Reduction in Scientific Workflows
R. Souza, V. Silva, A. Coutinho, P. Valduriez, and M. Mattoso
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2016.
[C31] [online] [bibtex]

Integrating Domain-data Steering with Code-profiling Tools to Debug Data-intensive Workflows
V. Silva, L. Neves, R. Souza, A. Coutinho, D. Oliveira, and M. Mattoso
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2016.
[C32] [bibtex]

Applying future Exascale HPC methodologies in the energy sector
J. Camata, J. Cela, D. Costa, A. Coutinho, D. Fernández-Galisteo, R. Souza, C. Jiménez, V. Kourdioumov, M. Mattoso, R. Mayo-García, T. Miras, J. Moríñigo, J. Navarro, D. de Oliveira, M. Rodríguez-Pascual, V. Silva, and P. Valduriez
Russian Supercomputing Days, 2016.
[C33] [online] [pdf] [bibtex]

Building a question-answering corpus using social media and news articles
P. Cavalin, F. Figueiredo, M. de Bayser, L. Moyano, H. Candello, A. Appel, and R. Souza
International Conference on Computational Processing of the Portuguese Language, 2016.
[C34] [bibtex]

Enhancing Energy Production with Exascale HPC Methods
J. Camata, J. Cela, D. Costa, A. Coutinho, D. Fernández-Galisteo, C. Jimenez, V. Kourdioumov, M. Mattoso, R. Mayo-García, T. Miras, J. Moríñigo, J. Navarro, P. Navaux, D. De Oliveira, M. Rodríguez-Pascual, V. Silva, R. Souza, and P. Valduriez
CARLA: Latin American High Performance Computing Conference, 2016.
[C35] [doi] [online] [pdf] [bibtex]

Parallel Execution of Workflows Driven by a Distributed Database Management System
R. Souza, V. Silva, D. Oliveira, P. Valduriez, A. Lima, and M. Mattoso
ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2015.
[C36] [online] [pdf] [bibtex]

Uma Abordagem para Publicação de Dados de Proveniência de Workflows Científicos na Web Semântica
R. Castro, R. Souza, V. Silva, K. Ocaña, D. Oliveira, and M. Mattoso
Brazilian Symposium on Databases (SBBD), 2015.
[C37] [bibtex]

Applying data warehousing and big data techniques to analyze internet performance
T. Barbosa, R. Souza, S. Cruz, M. Campos, and R. Cottrell
International Conference on Internet Applications, Protocols, and Services (NETAPPS), 2015.
[C38] [pdf] [bibtex]

Linked open data publication strategies: Application in networking performance measurement data
R. Souza, L. Cottrell, B. White, M. Campos, and M. Mattoso
ASE BigData/SocialCom/CyberSecurity, Stanford, CA, 2014.
[C39] [pdf] [bibtex]

Patents

Shortened narrative instruction generator for software code change
M. Netto, L. Real, B. Silva, and R. Souza, 2024.
[P1] [online] [bibtex]

Data transformation for acceleration of context migration in interactive computing notebooks
L. Real, R. Cunha, R. Souza, and M. Netto, 2023.
[P2] [online] [bibtex]

Remotely healing crashed processes
M. Netto, B. Silva, R. Cunha, R. Souza, and L. Real, 2023.
[P3] [online] [bibtex]

Asset identification for collaborative projects in software development
L. Real, R. Cunha, M. dos Santos, and R. Souza, 2022.
[P4] [online] [bibtex]

Program context migration
L. Real, M. Netto, R. Cunha, R. Souza, and A. Braz, 2022.
[P5] [online] [bibtex]

Model Document Creation in Source Code Development Environments using Semantic-aware Detectable Action Impacts
A. Appel, C. De Freitas, R. Souza, C. Mendes, A. Vital, N. Dos, S. Marcelo, M. Stelmar Netto, P. Avegliano, and C. Villas, 2022.
[P6] [online] [bibtex]

Continuous storage of data in a system with limited storage capacity
L. Real, M. dos Santos, and R. Souza, 2021.
[P7] [online] [bibtex]

Metadata-based scientific data characterization driven by a knowledge database at scale
R. Souza, R. Mozart, F. Da Silva, A. Vital, and V. Silva, 2021.
[P8] [online] [bibtex]

Creating coordinated multi-chatbots using natural dialogues by means of knowledge base
M. De Bayser, A. Braz, P. Cavalin, F. Figueiredo, and R. Souza, 2018.
[P9] [online] [bibtex]

System and method for managing artificial conversational entities enhanced by social knowledge
A. Braz, P. Cavalin, F. Figueiredo, M. De Bayser, and R. Souza, 2018.
[P10] [online] [bibtex]

Predicting user question in question and answer system
A. Appel, A. Gama Leal, and R. Souza, 2017.
[P11] [online] [bibtex]

Last updated on 2025-06-26.

Complete version of the curriculum vitae: Renan-Souza-CV-full.pdf.

Source code to generate this website and the CV: https://github.com/renan-souza/cv.
