Renan Souza, Ph.D.
Data Science and Data Engineering Researcher at IBM Research

CV

All Publications and Patents

Please feel free to contact me if you need a preprint of a paper that is not available here.

Theses

Supporting User Steering in Large-scale Workflows with Provenance Data
R. Souza
COPPE/Federal University of Rio de Janeiro, Ph.D. Thesis, 2019.
[T1] [online] [pdf] [bibtex]
Controlling the Parallel Execution of Workflows Relying on a Distributed Database
R. Souza
COPPE/Federal University of Rio de Janeiro, M.Sc. Thesis, 2015.
[T2] [online] [pdf] [bibtex]
Linked Open Data Publication Strategies: An Application in Network Performance Data (in Portuguese)
R. Souza
DCC/Federal University of Rio de Janeiro, B.Sc. Thesis, 2013.
[T3] [pdf] [bibtex]

Journal Articles

Distributed In-memory Data Management for Workflow Executions
R. Souza, V. Silva, A. Lima, D. Oliveira, P. Valduriez, and M. Mattoso
PeerJ Computer Science, 2021.
[J1] [abstract] [doi] [online] [pdf] [bibtex]
Workflow Provenance in the Lifecycle of Scientific Machine Learning
R. Souza, L. Azevedo, V. Lourenço, E. Soares, R. Thiago, R. Brandão, D. Civitarese, E. Brazil, M. Moreno, P. Valduriez, M. Mattoso, R. Cerqueira, and M. Netto
arXiv preprint Databases (cs.DB), 2020.
[J2] [abstract] [online] [pdf] [bibtex]
Adding Hyperknowledge-enabled data lineage to a machine learning workflow management system for oil and gas
L. Azevedo, R. Souza, R. Brandão, V. Lourenço, M. Costalonga, M. de OC Machado, M. Moreno, and R. Cerqueira
First Break, 2020.
[J3] [doi] [bibtex]
Keeping Track of User Steering Actions in Dynamic Workflows
R. Souza, V. Silva, J. Camata, A. Coutinho, P. Valduriez, and M. Mattoso
Future Generation Computer Systems, 2019.
[J4] [abstract] [doi] [pdf] [bibtex]
Adding Domain Data to Code Profiling Tools to Debug Workflow Parallel Execution
V. Silva, L. Neves, R. Souza, A. Coutinho, D. de Oliveira, and M. Mattoso
Future Generation Computer Systems, 2018.
[J5] [doi] [bibtex]
Data Reduction in Scientific Workflows Using Provenance Monitoring and User Steering
R. Souza, V. Silva, A. Coutinho, P. Valduriez, and M. Mattoso
Future Generation Computer Systems, 2017.
[J6] [abstract] [doi] [pdf] [bibtex]
A Hybrid Architecture for Multi-party Conversational Systems
M. de Bayser, P. Cavalin, R. Souza, A. Braz, H. Candello, C. Pinhanez, and J. Briot
arXiv preprint Computation and Language (cs.CL), 2017.
[J7] [online] [pdf] [bibtex]

Conference and Workshop Papers

Cycle Orchestrator: A Knowledge-Based Approach for Structuring Cyclic ML Pipelines in the O&G Industry
R. Brandão, V. Lourenço, M. Machado, L. Azevedo, M. Cardoso, R. Souza, G. Lima, R. Cerqueira, and M. Moreno
International Semantic Web Conference (ISWC), 2020.
[C1] [bibtex]
A Knowledge-Based Approach for Structuring Cyclic Workflows
R. Brandão, V. Lourenço, M. Machado, L. Azevedo, M. Cardoso, R. Souza, G. Lima, R. Cerqueira, and M. Moreno
International Semantic Web Conference (ISWC), 2020.
[C2] [bibtex]
Runtime Steering of Parallel CFD Simulations
R. Souza, J. Camata, M. Mattoso, and A. Coutinho
International Conference on Parallel Computational Fluid Dynamics, 2020.
[C3] [bibtex]
Experiencing ProvLake to Manage the Data Lineage of AI Workflows
L. Azevedo, R. Souza, R. Thiago, E. Soares, and M. Moreno
Meeting on Innovation in Information Systems (EISI) at the Brazilian Symposium on Information Systems (SBSI), 2020.
[C4] [bibtex]
Modern Federated Databases: an Overview
L. Azevedo, R. Souza, E. Soares, and M. Moreno
International Conference on Enterprise Information Systems (ICEIS), 2020.
[C5] [bibtex]
Supporting the Training of Physics Informed Neural Networks for Seismic Inversion Using Provenance
R. Souza, A. Codas, J. Nogueira Junior, M. Quinones, L. Azevedo, R. Thiago, E. Soares, M. Cardoso, and L. Martins
American Association of Petroleum Geologists Annual Convention and Exhibition (AAPG), 2020.
[C6] [bibtex]
Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case
R. Thiago, R. Souza, L. Azevedo, E. Soares, R. Santos, W. Santos, M. De Bayser, M. Cardoso, M. Moreno, and R. Cerqueira
European Association of Geoscientists and Engineers (EAGE) Digitalization Conference and Exhibition, 2020.
[C7] [doi] [pdf] [bibtex]
Efficient Runtime Capture of Multiworkflow Data Using Provenance
R. Souza, L. Azevedo, R. Thiago, E. Soares, M. Nery, M. Netto, E. Brazil, R. Cerqueira, P. Valduriez, and M. Mattoso
IEEE International Conference on e-Science (eScience), 2019.
[C8] [abstract] [doi] [pdf] [bibtex]
Managing Data Traceability in the Data Lifecycle for Deep Learning Applied to Seismic Data
R. Souza, E. Brazil, L. Azevedo, R. Ferreira, E. Soares, R. Thiago, M. Nery, V. Torres, and R. Cerqueira
American Association of Petroleum Geologists Annual Convention and Exhibition (AAPG), 2019.
[C9] [online] [bibtex]
Provenance Data in the Machine Learning Lifecycle in Computational Science and Engineering
R. Souza, L. Azevedo, V. Lourenço, E. Soares, R. Thiago, R. Brandão, D. Civitarese, E. Vital Brazil, M. Moreno, P. Valduriez, M. Mattoso, R. Cerqueira, and M. A. S. Netto
Workflows in Support of Large-Scale Science (WORKS) co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2019.
[C10] [abstract] [doi] [pdf] [bibtex]
Towards a human-in-the-loop library for tracking hyperparameter tuning in deep learning development
R. Souza, L. Neves, L. Azeredo, R. Luiz, E. Tady, P. Cavalin, and M. Mattoso
Latin America Data Science (LADaS) workshop co-located with the Very Large Data Bases (VLDB) conference, 2018.
[C11] [pdf] [bibtex]
Capturing Provenance for Runtime Data Analysis in Computational Science and Engineering Applications
V. Silva, R. Souza, J. Camata, D. de Oliveira, P. Valduriez, A. Coutinho, and M. Mattoso
Provenance and Annotation of Data and Processes - International Provenance and Annotation Workshop (IPAW), 2018.
[C12] [doi] [bibtex]
Provenance of Dynamic Adaptations in User-Steered Dataflows
R. Souza and M. Mattoso
Provenance and Annotation of Data and Processes - International Provenance and Annotation Workshop (IPAW), 2018.
[C13] [doi] [pdf] [bibtex]
Ravel: A MAS orchestration platform for Human-Chatbots Conversations
M. de Bayser, C. Pinhanez, H. Candello, M. Affonso, M. Vasconcelos, M. Guerra, P. Cavalin, and R. Souza
International Workshop on Engineering Multi-Agent Systems (EMAS@AAMAS 2018), 2018.
[C14] [pdf] [bibtex]
Scientific Data Analysis Using Data-Intensive Scalable Computing: the SciDISC Project
P. Valduriez, M. Mattoso, R. Akbarinia, H. Borges, J. Camata, A. Coutinho, D. Gaspar, N. Lemus, J. Liu, H. Lustosa, F. Masseglia, F. Nogueira Da Silva, V. Silva, R. Souza, K. Ocaña, E. Ogasawara, D. Oliveira, E. Pacitti, F. Porto, and D. Shasha
LADaS: Latin America Data Science Workshop, 2018.
[C15] [online] [pdf] [bibtex]
Spark Scalability Analysis in a Scientific Workflow
R. Souza, V. Silva, P. Miranda, A. Lima, P. Valduriez, and M. Mattoso
Simpósio Brasileiro de Banco de Dados (SBBD), 2017.
[C16] [pdf] [bibtex]
Tracking of online parameter fine-tuning in scientific workflows
R. Souza, V. Silva, J. Camata, A. Coutinho, P. Valduriez, and M. Mattoso
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2017.
[C17] [online] [bibtex]
Integrating Domain-data Steering with Code-profiling Tools to Debug Data-intensive Workflows
V. Silva, L. Neves, R. Souza, A. Coutinho, D. Oliveira, and M. Mattoso
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2016.
[C18] [bibtex]
Online Input Data Reduction in Scientific Workflows
R. Souza, V. Silva, A. Coutinho, P. Valduriez, and M. Mattoso
Workflows in Support of Large-Scale Science (WORKS) workshop co-located with the ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2016.
[C19] [online] [bibtex]
Applying data warehousing and big data techniques to analyze internet performance
T. Barbosa, R. Souza, S. Cruz, M. Campos, and R. Cottrell, 2016.
[C20] [pdf] [bibtex]
Building a question-answering corpus using social media and news articles
P. Cavalin, F. Figueiredo, M. de Bayser, L. Moyano, H. Candello, A. Appel, and R. Souza
International Conference on Computational Processing of the Portuguese Language, 2016.
[C21] [bibtex]
Enhancing Energy Production with Exascale HPC Methods
J. Camata, J. Cela, D. Costa, A. Coutinho, D. Fernández-Galisteo, C. Jiménez, V. Kourdioumov, M. Mattoso, R. Mayo-García, T. Miras, J. Moríñigo, J. Navarro, P. Navaux, D. de Oliveira, M. Rodríguez-Pascual, V. Silva, R. Souza, and P. Valduriez
CARLA: Latin American High Performance Computing Conference, 2016.
[C22] [doi] [online] [pdf] [bibtex]
Applying future Exascale HPC methodologies in the energy sector
J. Camata, J. Cela, D. Costa, A. Coutinho, D. Fernández-Galisteo, C. Jiménez, V. Kourdioumov, M. Mattoso, R. Mayo-García, T. Miras, J. Moríñigo, J. Navarro, D. de Oliveira, M. Rodríguez-Pascual, V. Silva, R. Souza, and P. Valduriez
Russian Supercomputing Days, 2016.
[C23] [online] [pdf] [bibtex]
Parallel Execution of Workflows Driven by a Distributed Database Management System
R. Souza, V. Silva, D. Oliveira, P. Valduriez, A. Lima, and M. Mattoso
ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2015.
[C24] [online] [pdf] [bibtex]
Uma Abordagem para Publicação de Dados de Proveniência de Workflows Científicos na Web Semântica
R. Castro, R. Souza, V. Silva, K. Ocaña, D. Oliveira, and M. Mattoso
Simpósio Brasileiro de Banco de Dados (SBBD), 2015.
[C25] [bibtex]
Linked open data publication strategies: Application in networking performance measurement data
R. Souza, L. Cottrell, B. White, M. Campos, and M. Mattoso
ASE BigData/SocialCom/CyberSecurity, Stanford, CA, 2014.
[C26] [pdf] [bibtex]

Patents

Continuous storage of data in a system with limited storage capacity
L. Real, M. dos Santos, and R. Souza, 2021.
[P1] [bibtex]
Metadata-based scientific data characterization driven by a knowledge database at scale
R. Souza, R. Mozart, F. Da Silva, A. Vital, and V. Silva, 2021.
[P2] [bibtex]
Creating coordinated multi-chatbots using natural dialogues by means of knowledge base
M. De Bayser, A. Braz, P. Cavalin, F. Figueiredo, and R. Souza, 2018.
[P3] [bibtex]
System and method for managing artificial conversational entities enhanced by social knowledge
A. Braz, P. Cavalin, F. Figueiredo, M. De Bayser, and R. Souza, 2018.
[P4] [bibtex]
Predicting user question in question and answer system
A. Appel, A. Gama Leal, and R. Souza, 2017.
[P5] [bibtex]

Last updated on 2021-06-07.

Source code to generate this website and the CV: https://github.com/renan-souza/cv