CV
Feel free to reach out if you have any questions!
Basics
Name | Siddharth Parekh |
Email | spparekh@andrew.cmu.edu |
Phone | (332) 248-8513 |
Location | Pittsburgh, PA |
Education
- 2021.08 - 2025.05 Pittsburgh, PA
B.S. in Computer Science
Carnegie Mellon University, Pittsburgh, PA
SCS Concentration in Machine Learning
Machine Learning and AI:
- 10-315 Introduction to Machine Learning
- 11-411 Natural Language Processing
- 11-485 Introduction to Deep Learning
- 10-422 Foundations of Learning, Game Theory, and Their Connections
- 10-708 Probabilistic Graphical Models *
- 10-720 Convex Optimization *
Computer Science:
- 15-213 Introduction to Computer Systems
- 15-210 Parallel and Sequential Algorithms and Data Structures
- 15-251 Great Theoretical Ideas in Computer Science
- 15-451 Algorithm Design and Analysis
- 15-445 Database Systems
Mathematics:
- 15-151 Mathematical Foundations of Computer Science
- 21-241 Matrices and Linear Transformations
- 21-259 Calculus in Three Dimensions
- 36-218 Probability Theory for Computer Scientists
Computational Finance:
- 21-270 Introduction to Mathematical Finance
- 21-378 Mathematics of Fixed Income Markets
* - graduate course
Work
- 2022.06 - 2022.08 Mumbai, India
AI Intern - NLP
MikoAI
Streamlined the multilingual personality module of the Miko robot, enhancing its language processing capabilities across 8 languages.
- Benchmarked open-source machine translation models to optimize cost-efficiency without compromising performance.
- Developed a neural classifier for question answering, achieving linear speedup over traditional vector search.
Awards
- 2021 - 2025
Dean's List with High Honors
CMU School of Computer Science
Fall 2021, Fall 2022, Spring 2023, Fall 2024
Publications
- 2024 AliGATr: Graph-based layout generation for form understanding
EMNLP 2024 Findings
Forms constitute a large portion of layout-rich documents that convey information through key-value pairs. In this paper, we present AliGATr, a graph-based model that uses a generative objective to represent the complex grid-like layouts often found in forms. Using a grid-based graph topology, our model learns to generate the layout of each page token by token in a data-efficient manner, performing on par with state-of-the-art models.
Skills
Programming Languages | C/C++, Python, Java, JavaScript, Julia, OCaml |
Machine Learning | PyTorch, TensorFlow, Transformers, Scikit-Learn, NumPy, Pandas |
Cloud | AWS, Google Cloud Platform |
Miscellaneous | Git, Docker |
Languages
English | Native |
Gujarati | Native |
Hindi | Fluent |
Spanish | Beginner |