Jay Chi Resume

Jay Chi · in/lostjay · lostjay · lostjay.xyz
SUMMARY
Backend Developer with production experience designing and maintaining large-scale crawling and data service systems. Built backend services and distributed crawling pipelines using Python, Java, FastAPI, Spring Boot, Kafka, MongoDB, MySQL, and Redis, with coroutine-based concurrency, proxy management, retry/backoff strategies, task state management, and structured data parsing. Supported crawler platforms including the Purple resume crawling pipeline and the Quake headhunter platform crawlers. Provided real-time crawling APIs for certificate verification, web search, and on-demand data retrieval, supporting low-latency business data access.
WORK EXPERIENCE
Crawler Engineer Intern -> Crawler Engineer
Kanzhun / BOSS Zhipin, Beijing
Python, Java, JS/Android Reverse Engineering, Anti-bot Analysis, AI Agents
Apr 2024 - Present
Built and maintained large-scale web crawling systems for multimodal AI training data and business data demands, covering recruitment, resume, headhunter, and competitor intelligence scenarios
Provided real-time crawling APIs for internal business workflows, including certificate verification, web search, and on-demand data retrieval
Developed backend services and crawler task workflows using Python, Java, FastAPI, Spring Boot, Kafka, MongoDB, and Redis
Designed distributed crawling pipelines with coroutine-based concurrency, proxy management, task scheduling, retry handling, state management, and structured data parsing
Performed anti-bot and risk-control analysis, covering device/browser fingerprinting, TLS/HTTP/2 fingerprinting, network traffic capture, captcha solving, and customized patched/stealth Playwright runtimes
Reverse-engineered JavaScript, Android, and WeChat Mini Program workflows to analyze request signatures, encryption logic, authentication flows, and anti-crawling mechanisms
Applied cryptographic analysis including AES, RSA, and message digest algorithms to reproduce protected request parameters and verify data integrity
Designed the architecture of an AI-agent-assisted crawling platform, integrating Model Context Protocol, context management, SSE streaming, and tool orchestration to support crawler configuration and debugging
PROJECT
Multimodal AI Training Data Scraping
Large-scale Crawling · AI Training Data
Python, Java, Distributed Crawling, Proxy Management, Anti-bot Analysis
Apr 2024 – Present
Built and maintained large-scale crawling workflows for multimodal AI training data collection, covering text, image, video, document, and structured web data from platforms including YouTube, Zhihu, Baidu Wenku, and other high-risk web sources
Supported petabyte-scale annual data collection, contributing to a data supply system with up to 5 PB/year of collection capacity for AI training and business data delivery
Developed distributed crawling pipelines with coroutine-based concurrency, proxy management, retry/backoff strategies, task state management, and structured data parsing to ensure stable high-throughput delivery
Analyzed platform anti-bot mechanisms, including request behavior limits, access-control triggers, fingerprinting signals, Android/Web API constraints, and JS-rendering barriers
Business Intelligence Crawling
Recruitment Data · Corporate Landscape · Marketing Intelligence
Python, Java, Playwright, Kafka, MongoDB, Redis, Reverse Engineering
Jan 2025 – Present
Built and maintained multiple business-intelligence crawling systems covering headhunter resumes, headhunter jobs, competitor job postings, corporate landscape data, ad promotion data, marketing balance, and transaction records
Supported Purple for headhunter-platform resume crawling, including account/session handling, resume fetching, structured parsing, task state management, and callback-based result delivery
Supported Quake for headhunter-platform job crawling, including platform login flows, job list/detail retrieval, position parsing, session management, and crawler stability improvements
Built competitor job crawling workflows to track job posting updates, online/offline status changes, and market signals from competitor platforms; supported downstream CRM analysis by cleaning and classifying the data into two business categories
Maintained corporate landscape crawling workflows for company profiles, licenses, qualification records, and related corporate metadata, with primary ownership of Hong Kong company data sources
Implemented authenticated crawling workflows for ad promotion, account-level marketing metrics, balance information, promotion records, and transaction details to support financial and marketing data reconciliation
Improved crawler robustness through proxy management, browser automation, captcha handling, cookie/session management, retry/backoff strategies, structured error handling, and reverse engineering of request signatures and anti-crawling mechanisms
AI Crawling System
AI Agent-assisted Crawling Platform
Python, Java, MCP, OpenAI Agents SDK, SSE Streaming
Jul 2025 – Present
Designed the architecture of an AI-agent-assisted crawling platform for crawler configuration, field parsing, seed management, and debugging workflows
Integrated Model Context Protocol, context management, SSE streaming, and tool orchestration to connect LLM reasoning with crawler tools and browser automation
Built agent workflows for request/response analysis, parsing-field recommendation, seed browsing, and crawler configuration assistance
Improved multi-turn agent reliability by handling context consistency, tool-call state, MCP service lifecycle, and streaming response behavior
EDUCATION
Bachelor of Management in Information Management and Information Systems
University of International Business and Economics
Sep 2020 - Jul 2024
SKILLS
Backend & Distributed Systems
Python·Java·FastAPI·Spring Boot·Asyncio·Concurrency·Socket Programming·Distributed Systems·Kafka·MongoDB·MySQL·Redis
Web Crawling & Reverse Engineering
Web Scraping·Anti-bot & Risk Control Analysis·Device & Browser Fingerprinting·Captcha Solving·JavaScript / Android / WeChat Mini Program Reverse Engineering·Cryptography: AES, RSA, Message Digest Algorithms·TLS / HTTP/2 Fingerprinting·Browser Automation: Playwright
AI Agent Engineering
Model Context Protocol·Context Management·SSE Streaming·Agent Orchestration
DevOps & Infrastructure
Docker Compose·Linux·Git·Nginx·Proxy Networking
Languages
Mandarin Chinese (Native)·English (Professional Working Proficiency)
