• Home
  • Job Experience
  • Education
  • Certificate
  • Project
  • Resume

Kelvin Ho

Software Engineer

Location: Hong Kong

Phone: +852 6219 7438

Email: hckkelvin@gmail.com

LinkedIn: linkedin.com/in/hckkelvin

Hello!! I'm ...


A diligent software engineer in commercial application development, designing and delivering innovative software solutions that enhance business productivity.

I love putting my computer and programming skills to practical use, finding solutions to problems by managing and analysing data.

Highly experienced in all aspects of the software development life cycle, with a track record of producing high-quality documentation for clients.

ToolBox

Where I've Worked

Software Engineer @ EmblocSoft (Hong Kong) Limited
May 2021 - Apr 2022

Engineered modern applications with Scala, Python, Apache HBase, Apache Hadoop, Apache Spark, Apache ECharts.

Built innovative Apache Kafka microservices on top of Kubernetes (Azure Kubernetes Service) to stream millions of records in real time.

Installed PostgreSQL clusters with auto-failover, backup scripts, and monitoring scripts.

Implemented ETL services with Kettle (Pentaho Data Integration) and SSIS; improved the performance of existing programs by 30% by redesigning merge-join logic.

Provided high-quality documentation, properly filed and organized.

Deployed and integrated software engineered by the team, and updated integration and deployment scripts to improve continuous integration practices.
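The merge-join redesign mentioned above can be illustrated with a minimal sketch (hypothetical data and function, not the actual Kettle/SSIS implementation): when both inputs are sorted on the join key, a merge join visits each row once, O(n + m), instead of the O(n × m) comparisons of a naive nested-loop join.

```python
def merge_join(left, right, key=lambda row: row[0]):
    # Minimal sorted merge join: both inputs must be sorted on the join key.
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Collect every pairing for this key (handles duplicate keys).
            j_start = j
            while i < len(left) and key(left[i]) == lk:
                j = j_start
                while j < len(right) and key(right[j]) == lk:
                    result.append(left[i] + right[j][1:])
                    j += 1
                i += 1
    return result

orders = [(1, "order-a"), (2, "order-b"), (4, "order-c")]  # sorted by id
users = [(1, "alice"), (2, "bob"), (3, "carol")]           # sorted by id
print(merge_join(orders, users))
# [(1, 'order-a', 'alice'), (2, 'order-b', 'bob')]
```

In an ETL tool the same idea applies: pre-sorting (or reading pre-sorted input) lets the merge-join step stream both sources in a single pass.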

My Education Path

2017 - 2020
BSc Computer Science
Swansea University, United Kingdom

Final Year Project: Mobile Application. Second-class honours, lower division.

2016 - 2017
Warwick International Foundation Programme
University of Warwick, United Kingdom

International Foundation Programme in Mathematics and Economics

2010 - 2016
Hong Kong Diploma of Secondary Education Examination
La Salle College, Hong Kong

Electives: Geography, Information and Communication Technology

Things I've learnt during leisure time

IBM Data Science Specialization

IBM | Coursera

Show Credential

IBM Data Analyst Specialization

IBM | Coursera

Show Credential

Applied Data Science Specialization

IBM | Coursera

Show Credential

Data Science Fundamentals with Python and SQL

IBM | Coursera

Show Credential

Data Visualization with Tableau Specialization

University of California | Coursera

Show Credential

Introduction to Data Science Specialization

IBM | Coursera

Show Credential

Python for Everybody Specialization

University of Michigan | Coursera

Show Credential

Web Design for Everybody

University of Michigan | Coursera

Show Credential

The Fundamentals of Digital Marketing

IBM | Coursera

No Credential

Data-driven Decision Making

PwC | Coursera

Show Credential

Things I've Built

Apache Kafka

I have used Apache Kafka to help clients handle millions of records from different database sources and files and transmit them to Microsoft Azure in real time.

Want to know more?

  • Apache Kafka Official
  • Apache Zookeeper Official
  • Microsoft Azure
  • Azure Kubernetes Service
  • PostgreSQL Official

PostgreSQL

I helped clients install PostgreSQL clusters with auto-failover, monitoring, and backup scripts. I also helped clients perform both major and minor upgrades of their PostgreSQL clusters.


Apache Kafka

My Experience

I helped clients build Apache Kafka clusters on top of Kubernetes (Azure Kubernetes Service), handling millions of records from different types of databases and transmitting them to Microsoft Azure in real time. Based on clients' requests, user records are stored in Azure SQL Database while record attachments are stored in Azure Blob Storage.

Since Apache Kafka 2.7.0 was used in these projects, Apache Zookeeper had to be installed manually; it is used to coordinate and monitor the Apache Kafka cluster.


Technology involved



Note: Starting from Apache Kafka 2.8.0, Apache Zookeeper can be replaced with a self-managed metadata quorum (KRaft mode), which means Apache Zookeeper no longer has to be installed when the Apache Kafka version is >= 2.8.0.
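As a sketch, a minimal KRaft-mode `server.properties` for a single combined broker/controller node might look like the fragment below (illustrative values only; check the Kafka documentation for your exact version):

```properties
# Node acts as both broker and KRaft controller (no Zookeeper needed)
process.roles=broker,controller
node.id=1
# Voting members of the metadata quorum: id@host:port
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
```

Before first start, the storage directory also has to be formatted with a cluster id (via the `kafka-storage` tool shipped with Kafka).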


Want to know more about Apache Kafka?

Apache Kafka is a distributed event streaming platform used for real-time data integration and streaming data pipelines. It was originally developed at LinkedIn for activity stream data and operational metrics, and was open-sourced through the Apache Software Foundation in early 2011. As a distributed, partitioned messaging system, it enables trillions of messages per day to be processed and delivered to numerous receivers in real time. It is also highly fault-tolerant and highly scalable.
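The "partitioned" part can be illustrated with a toy sketch: Kafka's default partitioner assigns each keyed record to a partition by hashing its key (real Kafka uses a murmur2 hash of the serialized key; the md5-based hash below is purely illustrative), so all records with the same key land on the same partition and keep their relative order.

```python
import hashlib

NUM_PARTITIONS = 6  # illustrative topic size

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Toy stand-in for Kafka's default partitioner: hash the key,
    # then take the result modulo the partition count.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# The same key always maps to the same partition, which is what
# preserves per-key ordering across producers and consumers.
assert partition_for("user-42") == partition_for("user-42")
print(sorted({partition_for(f"user-{i}") for i in range(100)}))
```

Unkeyed records are handled differently (round-robin or, in newer clients, sticky batching), but keyed records rely on exactly this hash-then-modulo scheme.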


Apache Kafka is used by thousands of companies

Netflix has two sets of Apache Kafka clusters: Fronting Kafka and Consumer Kafka.


Fronting Kafka clusters are in charge of obtaining messages from producers, which essentially every Netflix application instance is. They serve as data collectors and buffers for systems farther down the line. Consumer Kafka clusters contain a subset of topics routed by Samza for real-time consumers.


In 2016, Netflix already operated 36 Kafka clusters consisting of more than 4,000 broker instances across both Fronting Kafka and Consumer Kafka, ingesting more than 700 billion messages a day.

Spotify is a digital music, podcast, and video service that gives you access to millions of songs and other content from creators all over the world. It hosted 82 million songs and had 422 million active users as of the first quarter of 2022.


Even though Spotify decided to migrate from Apache Kafka to Google Cloud Pub/Sub in 2016, the way Spotify used Apache Kafka on its platform is still intriguing to observe: whenever a user listens to a specific song or podcast, Spotify logs the experience as an event and uses it as data to learn more about the user's preferences. Additionally, Spotify records infrastructure-level events, such as when a logging server runs out of disk space. Events of more than 300 different types are gathered from Spotify users.

PostgreSQL

My Experience

I helped clients build PostgreSQL clusters with a master node and a standby node. Backup scripts and log rotation scripts are also provided via cron. Furthermore, since an auto-failover script is provided as well, downtime for the cluster is kept to a minimum. Once the master node goes down, the auto-failover script is executed within a minute (this interval can be adjusted based on the client's preference). The standby node is then promoted to master, and two alert emails are sent to the client: one about the failure of the master node and one about the result of the failover.
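The failover flow described above can be sketched as a small monitoring loop (hypothetical helper names injected as callables; a real script would also handle fencing, re-checks, and PostgreSQL-specific promotion commands):

```python
import time

def monitor(ping_master, promote_standby, send_alert,
            check_interval_s=10, failures_before_promote=3):
    """Promote the standby after N consecutive failed probes of the master.

    Sketch of the auto-failover logic described above; ping_master,
    promote_standby, and send_alert are caller-supplied functions.
    """
    failures = 0
    while True:
        if ping_master():
            failures = 0  # master healthy again: reset the counter
        else:
            failures += 1
            if failures >= failures_before_promote:
                send_alert("alert: master node is down")
                promote_standby()
                send_alert("alert: failover finished, standby promoted")
                return
        time.sleep(check_interval_s)

# Simulated run: the master answers once, then goes down for good.
events = []
probes = iter([True, False, False, False])
monitor(ping_master=lambda: next(probes),
        promote_standby=lambda: events.append("promote"),
        send_alert=events.append,
        check_interval_s=0)  # no waiting in the simulation
print(events)
```

Requiring several consecutive failures before promoting is what makes the "within a minute" window configurable: probe interval × failure threshold sets the reaction time, and trades speed against false failovers on transient network blips.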


Technology involved