Skills

For a business-focused overview, visit the Services section. Click on the drop-down titles for a detailed view of skills.

  • AWS Expertise:

    • S3, Athena, Glue Data Catalog, Redshift, Redshift Spectrum, EMR: Tools for storing, querying, and analyzing large datasets, enabling efficient data processing and analytics.

    • Lambda, EKS, ECR, SQS, RDS, VPC: Services for running applications, managing workflows, processing data, and maintaining secure networking.

    • Textract, Transcribe: AI services for extracting information from documents and converting speech to text.

  • Azure Expertise:

    • Data Factory, Data Lake Storage Gen 2, SQL Server, Event Hubs, Databricks: Solutions for creating automated data pipelines, securely storing data, and running advanced analytics and machine learning.

    • Function Apps, Logic Apps, API Management, Key Vault: Tools for automating workflows, integrating systems, and protecting sensitive data.

    • Azure DevOps: A platform for managing development workflows, automating deployments, and ensuring smooth CI/CD pipelines.

  • Infrastructure Automation Tools:

    • AWS CDK, CloudFormation, Ansible, Terraform: Tools to define and automate the setup of cloud resources, ensuring reproducibility, scalability, and reduced manual effort.

  • Software Engineering:

    • Background and education in software and data engineering, with 10 years of coding experience applying software engineering best practices.

      Databases:

    • Experience with SQL and NoSQL databases across storage models (such as columnar and row-based), including PostgreSQL, Redshift, SQL Server, Elasticsearch, AXON, and MongoDB.

      Programming Languages & Libraries:

    • Python, Java, PySpark, SQL, Pandas, Polars: Core programming and query languages, along with libraries for processing and analyzing data efficiently in various formats. Also knowledgeable in various other programming languages and libraries not highlighted here.

      Workflow Orchestration and Data Storage:

    • Airflow, Dagster, Oozie: Tools for automating and scheduling data workflows, ensuring smooth and timely operations.

    • Delta Lake: A storage layer that brings ACID transactions and consistent, reliable data management to large-scale analytics.

      Containerization and Orchestration:

    • Docker, Kubernetes: Technologies for packaging applications into containers and managing them in distributed environments, providing scalability and resource efficiency.

      Development and Collaboration Tools:

    • CI/CD: Automates the build, test, and deployment processes for faster and more reliable software delivery.

    • JIRA, Confluence: Platforms for project management, documentation, and team collaboration.

    • Git: A distributed version control system for managing code repositories and enabling collaborative development.

    • Generative AI and LLMs: Expertise in developing advanced solutions using Large Language Models (LLMs) for intelligent automation, natural language processing, and enhancing workflows.

    • OpenAI: Utilize OpenAI's API and tools for building conversational AI, text generation, and other natural language applications.

    • HuggingFace: Leverage HuggingFace's pre-trained models and libraries for tasks like semantic search, text summarization, and custom AI solutions.

    • LlamaIndex: Integrate LlamaIndex for creating and managing knowledge graphs that enable efficient data retrieval and semantic querying.

    • Semantic Kernel: Implement Semantic Kernel for building extensible AI applications that integrate seamlessly into business workflows.

    • Expertise in developing data strategies and data governance frameworks to align with organizational goals.

    • Skilled in stakeholder management, gathering business requirements, and translating them into actionable technical solutions.

    • Experience working in Agile teams to deliver value-driven projects and ensure alignment with business needs.

    • Experienced in creating and delivering technical training.