Fathurrahman Syarief

LinkedIn • GitHub

💻 I'm a recent data science graduate from Universitas Airlangga with hands-on experience delivering end-to-end data science projects, including descriptive and prescriptive analytics. I focus on using data to drive decisions and deliver meaningful results.

Experience

Bank Rakyat Indonesia

Data Analyst Intern, September 2024 – present

  • Assigned to the Planning, Budgeting, & Performance Management (PPM) Division
  • Reduced manual effort by building Python and Excel VBA tools that automate workload analysis for 5,000+ BRI units, cutting processing time from 2 weeks to 3 hours with 100% data accuracy
  • Collaborated on forecasting customer transactions across BRI units using XGBoost, achieving a 5% error rate. The model served as a decision-support tool to anticipate transaction surges

Bangkit Academy led by Google, Tokopedia, Gojek, & Traveloka

Cloud Computing Cohort, August 2023 – January 2024

  • Collaborated on the development of a food recommendation mobile app for vegan enthusiasts as a capstone project
  • Achieved 92% average precision in food image detection using a CSL-YOLO model
  • Deployed the model as a Dockerized FastAPI service on GCE with an NVIDIA T4 GPU

Central AI

Data Scientist Intern, August 2022 – December 2022

  • Developed a scraper for an Indonesian news portal using Selenium and bs4
  • Fine-tuned an IndoBERT model to classify the sentiment of Indonesian news with 94% accuracy
  • Wrapped the fine-tuned IndoBERT model in an API using Flask and deployed it on Google Cloud Run
  • Initiated a project to forecast Central AI cloud & local server compute usage using ARIMA, achieving a 15% cost-saving optimization

Universitas Airlangga

Research Assistant, January 2022 – July 2022

  • Assisted in teaching the Algorithmic Programming subject, helping students understand key concepts in algorithms and data structures
  • Co-authored a research paper on the use of machine learning in the healthcare industry

Technical Experience

Technical Tools

I have experience with a breadth of tools for machine learning, data analysis, and data pipelines

  • Programming Languages (high proficiency): Python
  • Programming Languages (some proficiency): R, SQL, MATLAB, VBA
  • Machine Learning Tools: TensorFlow, scikit-learn, PyTorch, MLflow, RapidMiner, SuperAnnotate
  • Cloud Services: Google Cloud (Vertex AI, Cloud Run, BigQuery, GCS, GKE, GCE)
  • Data Analysis & Visualization Tools: Microsoft Excel, Tableau, SPSS, Looker Studio, KNIME
  • Workflow & Automation Tools: Apache Airflow, Docker, Selenium, Terraform
  • Database Management Tools: PostgreSQL, MongoDB, Redis

Education

2020 - 2024

Bachelor of Data Science (S.Si.D.), Data Science Technology; Universitas Airlangga; Cum Laude


2017 - 2020

Science Major; SMAN 34 Jakarta

Projects

Developed a system to digitize handwritten Indonesian medical prescriptions using OCR. Leveraged YOLOv10 to detect key elements (e.g., drug names, dosages) and TrOCR to convert handwritten text into digital format. Integrated the open-source Llama 3.1 model to generate easy-to-understand explanations of each prescription.

This project supports the Indonesian prescription format and language, helping patients who struggle to read medical prescriptions due to lack of expertise or poor handwriting. It also aims to streamline healthcare workflows by automating the prescription review process, enhancing efficiency and accuracy in medical care

PyTorch

Flask

Transformer

LLM

OpenAI

Hugging Face

SuperAnnotate

SQLite


Tweetoxicity is a web app that uses a fine-tuned Distilled IndoBERT model (98% accuracy) to predict the sentiment of Twitter/X users based on their recent tweets or retweets. Users can input a username or topic, and the app will scrape the last 100 tweets, analyze the sentiment, and display the results in a dashboard.

PyTorch

FastAPI

Docker

Transformer

bs4

Streamlit


During my internship at Bank Rakyat Indonesia (BRI) as a Data Analyst, I developed a real-time sentiment analysis system for monitoring Google Play Store reviews of the BRI Mobile Banking application. We called the project BRImoSentiment.

I automated the scraping of Google Play Store reviews for the BRI Mobile Banking app using Python, capturing reviews from the last 24 hours. The pipeline, orchestrated with Apache Airflow, handled scraping, storing data in MongoDB, preprocessing, model inference using a fine-tuned Distilled IndoBERT model (97.8% accuracy) optimized with OpenVINO, and storing results back in the database. The system also sent email alerts when an error occurred or no reviews were available. The sentiment analysis results were integrated into a dashboard used by another division within BRI to monitor user feedback.

PyTorch

Apache Airflow

MongoDB

Transformer

OpenVINO

DAG


Built with Selenium and BeautifulSoup, nitter-harvest scrapes Twitter/X data, including topics, hashtags, and user tweets, through Nitter (an X mirror).

bs4

Selenium