Innocent Musanzikwa, Developer in Calgary, AB, Canada

Innocent Musanzikwa

Verified Expert in Engineering

Data Engineer and Developer

Location
Calgary, AB, Canada
Toptal Member Since
August 10, 2021

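Inno is an experienced data engineer and developer who spent the past decade working across Africa and North America at IRI, a top retail data analytics company, and the last few years as a freelance consultant. As a SQL and ETL developer, he has created high-quality data warehouses using industry-standard techniques such as Kimball and Data Vault. As a data engineer, Inno has built highly robust and scalable data pipelines, both on-premises and in the cloud, using several of the latest technologies.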

Portfolio

Darwill, Inc.
SQL, Tableau, Python, Data Engineering, Data Analysis, ETL, Data Warehousing...
SFL Scientific LLC
SQL, SQL Server Integration Services (SSIS), MariaDB, Microsoft SQL Server...
Airiam Holdings, LLC
Business Intelligence (BI), SQL, APIs, SQL Server DBA, Dimensional Modeling...

Experience

Availability

Part-time

Preferred Environment

SQL, PySpark, Python, Hadoop, Apache Hive, Azure Synapse, Oracle, SQL Server Integration Services (SSIS), Azure Data Factory, Data Warehousing

The most amazing...

...big data warehouse and data integration solution I've designed, using Python, SQL, ADF, Hadoop, Hive, and Spark, won an RFP in Canada out of six competitors.

Work Experience

Data Engineer

2022 - 2022
Darwill, Inc.
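  • Built Tableau dashboards and visualizations using AWS Redshift and Aurora databases.
  • Created AWS Lambda functions running Python for custom ETL tasks and ad hoc requests.
  • Managed AWS Redshift and Aurora databases and designed data warehouses and data migrations.
  • Redesigned the client's data warehouse using the AWS technology stack and improved their migration process by introducing federated queries and Lambda functions running Python pipelines, as well as overhauling their Tableau dashboards.
Technologies: SQL, Tableau, Python, Data Engineering, Data Analysis, ETL, Data Warehousing, Amazon Web Services (AWS), Relational Databases, Data Cleansing, Data Science, Databases, PostgreSQL, AWS Lambda, Database Development, Data Visualization, Azure SQL Data Warehouse (SQL DW), Database Modeling, MySQL, Entity Relationships, Business Analytics, Database Design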

Data Engineer

2022 - 2022
SFL Scientific LLC
  • Consulted on an existing, poorly designed SSIS data integration project and helped identify bottlenecks and inefficiencies.
  • Redesigned existing data pipelines using SSIS to improve efficiency and scalability.
  • Performed SQL tuning and SQL code reviews to improve process efficiency.
Technologies: SQL, SQL Server Integration Services (SSIS), MariaDB, Microsoft SQL Server, Data Transformation, Python, Database Schema Design, iPaaS, CI/CD Pipelines, Relational Databases, Stored Procedure, Data Analysis, T-SQL (Transact-SQL), SQL DML, Database Development, Data Analytics, Data Visualization, Azure SQL Data Warehouse (SQL DW), Database Modeling, Entity Relationships, Tableau, Business Analytics, Database Design

BI and Data Warehouse Expert

2021 - 2022
Airiam Holdings, LLC
  • Designed and developed data pipelines to integrate data from the QuickBooks API, Sage Intacct API, and spreadsheets into Azure SQL.
  • Designed and developed a data warehouse in Azure SQL.
  • Designed and created business reports and KPI dashboards using Power BI.
  • Developed complex SQL scripts to manage data transformations and speed up integration.
Technologies: Business Intelligence (BI), SQL, APIs, SQL Server DBA, Dimensional Modeling, Relational Databases, Microsoft Power BI, Cloud, Git, REST APIs, Synapse, DAX, Dashboard Design, Dashboards, Stored Procedure, Tableau, Data Analysis, T-SQL (Transact-SQL), SQL DML, Database Development, Data Analytics, Microsoft Power Automate, Data Visualization, Database Modeling, Entity Relationships, Business Analytics, Database Design

Data Analyst for Migration Project

2021 - 2021
JLL - JLLT Data
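  • Developed data pipelines to integrate data from Salesforce into Microsoft SQL.
  • Designed advanced SQL code, e.g., CTEs, stored procedures, and functions, to manage data transformations.
  • Performed SQL tuning to improve ETL efficiency and process scalability.
  • Consulted on standard operating procedures and best practices.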
Technologies: SQL, T-SQL (Transact-SQL), ETL, Salesforce, Data Migration, Relational Databases, Microsoft Power BI, SQL Server Reporting Services (SSRS), Stored Procedure, Data Analysis, Google Sheets, SQL DML, Database Development, Data Analytics, Database Modeling, Entity Relationships, Tableau, Business Analytics, Database Design

Director | Data Engineering

2019 - 2021
IRI
  • Developed Azure Data Factory pipelines to integrate data from Apache Hive, HDFS, OAuth 2 APIs, and various flat-file types into Azure SQL.
  • Managed a team of onshore and offshore big data developers, assigning tasks and tracking the progress on Jira.
  • Oversaw data strategy and recommendations for new data sources and ongoing projects.
  • Mentored big data engineers to help them develop their skills.
  • Built new data models and upgraded old data warehouses in response to client requirements or technology changes.
Technologies: Python, Apache Hive, Hadoop, Azure Synapse, Azure Data Factory, Bash Script, SQL, Azure SQL, Databricks, Data Engineering, ETL, Data Modeling, Databases, Azure, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Apache Airflow, Data Integration, Big Data, T-SQL (Transact-SQL), Data Migration, Snowflake, Data Build Tool (dbt), Apache Kafka, ELT, SQL Server Integration Services (SSIS), Data Transformation, Dimensional Modeling, Relational Databases, Microsoft Power BI, Cloud, SQL DML, Database Development, Azure SQL Data Warehouse (SQL DW), Database Modeling, Entity Relationships, Database Design

ETL Architect

2016 - 2019
IRI
  • Developed SQL-based data warehouses on-premise and on the cloud.
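  • Integrated various data sources, ranging from flat files to cloud-based sources such as Snowflake, AWS, and data lakes, into Azure Data Warehouse and Apache Hive on Hadoop.
  • Created scalable data pipelines and improved the efficiency of existing pipelines.
  • Trained and upskilled new data developers and participated in code reviews.
  • Maintained system documentation for all business data components and strategies.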
Technologies: SQL Server Integration Services (SSIS), Azure Synapse, Azure Data Factory, Databricks, PySpark, SQL, Oracle, Apache Hive, Hadoop, Data Warehouse Design, Data Engineering, ETL, Data Modeling, SQL Stored Procedures, Databases, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Data Integration, Big Data, BigQuery, JavaScript, T-SQL (Transact-SQL), Data Migration, Snowflake, Amazon Web Services (AWS), Amazon Elastic MapReduce (EMR), ELT, APIs, Data Transformation, MariaDB, SQL Server DBA, Dimensional Modeling, Relational Databases, Microsoft Power BI, Cloud, REST APIs, SQL DML, Database Development, Azure SQL Data Warehouse (SQL DW), Database Modeling, Entity Relationships, Performance Tuning, Dynamic SQL

SQL Lead Developer

2012 - 2016
IRI
  • Developed SQL-based data warehouses and data marts.
  • Wrote SQL queries to provide data for SSRS reports.
  • Built ETL processes using SSIS, Talend, and DataStage, depending on client requirements.
  • Created custom business reports using SQL Server Reporting Services (SSRS).
  • Managed junior developers and ran stand-up development meetings.
Technologies: SQL, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), PSQL, MySQL, Data Warehousing, Data Engineering, ETL, Data Modeling, SQL Stored Procedures, Databases, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Data Integration, Big Data, T-SQL (Transact-SQL), Data Migration, ELT, Data Transformation, Dimensional Modeling, Relational Databases, Microsoft Power BI, REST APIs, SSAS, Dashboard Design, Dashboards, SQL DML, Database Development, SSRS Reports, Azure SQL Data Warehouse (SQL DW), Database Modeling, SQL Server 2015, Entity Relationships, Business Analytics, Performance Tuning, Dynamic SQL

SQL/ETL Developer and Consultant

2010 - 2012
Mi9 Retail (formerly JustEnough Software Corporation)
  • Managed SQL replication between mobile devices and SQL Server.
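  • Created SQL data warehouses for reporting purposes using the Kimball methodology.
  • Designed and developed ETL packages using SQL Server Integration Services (SSIS).
  • Designed and developed reports in SQL Server Reporting Services (SSRS).
  • Performed database tuning and code reviews on any code deployed to production.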
Technologies: SQL, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), Microsoft SQL Server, Data Engineering, ETL, Data Modeling, SQL Stored Procedures, Databases, Data, Data Architecture, Business Intelligence (BI), Data Pipelines, Data Integration, Big Data, T-SQL (Transact-SQL), Data Migration, Data Transformation, Relational Databases, Microsoft Power BI, SSAS, SQL DML, Database Development, SSRS Reports, Database Modeling, SQL Server 2015, Entity Relationships

Data Migration from Azure SQL to Snowflake

http://github.com/innowarue/ADF
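This project involved migrating data from an Azure SQL database to a Snowflake data warehouse using an Azure Data Factory data pipeline. Given my skills and proficiency with Data Factory, it took me only a few minutes to create.

I replaced the real data sources with my own Azure and Snowflake accounts so that the project could be made publicly available without compromising confidentiality.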

Data Integration from OAuth2 API

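I created an automated data pipeline that integrates data accessible in JSON format through an OAuth2-based API into a cloud-based data warehouse solution. The solution uses Python and Spark on Databricks, integrated into an Azure Data Factory pipeline.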
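As a rough illustration of the first two steps of such a pipeline, the sketch below builds an OAuth 2.0 client-credentials token request and flattens nested JSON records into warehouse-friendly rows. It uses only the Python standard library rather than Spark or Databricks, and the token URL, credentials, payload, and function names are hypothetical placeholders, not taken from the actual project.

```python
import base64
import json
from urllib import parse, request


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return request.Request(token_url, data=body, headers=headers, method="POST")


def flatten(record, prefix=""):
    """Flatten nested JSON objects into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


# Hypothetical endpoint and credentials; sending the request would be
# request.urlopen(token_request), which is omitted here.
token_request = build_token_request(
    "https://example.com/oauth/token", "id", "secret"
)

# Hypothetical API payload standing in for a real JSON response.
payload = json.loads('[{"id": 1, "store": {"city": "Calgary", "region": "AB"}}]')
rows = [flatten(item) for item in payload]
print(rows)
```

Flattening to dotted column names is one simple way to map nested JSON onto tabular warehouse columns; in a Spark-based pipeline the same step would more likely be done with DataFrame operations.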

SQL Server Replication to Mobile Devices

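I created a replication system that synchronizes data between mobile devices and Microsoft SQL Server. Field sales representatives would collect information in the field, upload it to SQL Server using SQL CE, and download any updates from SQL Server through the mobile replication I set up.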

In-place Data Integration for an Acquisition

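I created an in-place ETL integration for a company acquisition and merger, consolidating the data from both companies into a single warehouse while continuing to deliver weekly reports to the customer service and retail service teams.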

Kafka Streaming and Data Integration

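I created an automated data pipeline to integrate data accessed through Kafka streams, ingesting it into Spark Streaming using Spark and Python and loading it into the Cloudera Hadoop file system through a Hive data warehouse solution.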

Languages

SQL, Python, Bash Script, T-SQL (Transact-SQL), Snowflake, Stored Procedure, SQL DML, Scala, JavaScript, Bash

Frameworks

Hadoop, Spark, Windows PowerShell, ADF

Libraries/APIs

PySpark, REST APIs, Spark Streaming

Tools

Microsoft Power BI, Tableau, BigQuery, Synapse, SSAS, Apache Airflow, Amazon Elastic MapReduce (EMR), Git, Google Sheets

Paradigms

ETL, Business Intelligence (BI), Dimensional Modeling, Database Development, Database Design, Data Science

Platforms

Amazon Web Services (AWS), AWS Lambda, Azure, Oracle, Databricks, Apache Kafka, Salesforce, Zeppelin

Storage

Apache Hive, MySQL, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), PSQL, Microsoft SQL Server, SQL Stored Procedures, PostgreSQL, Databases, Data Pipelines, Data Integration, Relational Databases, Database Architecture, RDBMS, Database Modeling, Dynamic SQL, NoSQL, SQL Server DBA, Database Replication, Azure SQL, MariaDB

Other

Azure Data Factory, Data Warehousing, Data Analysis, Data Engineering, Data, Data Architecture, Big Data, Data Migration, ELT, Data Warehouse Design, Data Transformation, Database Schema Design, ETL Tools, Scripting Languages, Data Analytics, Data Visualization, SSRS Reports, Azure SQL Data Warehouse (SQL DW), SQL Server 2015, Entity Relationships, Business Analytics, Performance Tuning, Data Modeling, Cloud, APIs, Dashboard Design, Dashboards, Microsoft Power Automate, Azure Synapse, Web Scraping, Data Build Tool (dbt), iPaaS, CI/CD Pipelines, DAX, Data Cleansing, Azure Databricks

Education

2013 - 2015

Bachelor's Degree in Information Technology

University of South Africa - Pretoria, South Africa

Certifications

AUGUST 2023 - AUGUST 2025

Databricks Certified Data Engineer Associate

Databricks

AUGUST 2023 - AUGUST 2025

SnowPro Core

Snowflake

DECEMBER 2020 - DECEMBER 2022

Certified Apache Spark and Hadoop Developer

Cloudera

DECEMBER 2019 - PRESENT

Analyzing Big Data with Hive

LinkedIn Learning

DECEMBER 2019 - PRESENT

Advanced NoSQL for Data Science

LinkedIn Learning