
How did you integrate all the different schemas? RDF, some kind of rule engine, or plain Java, Python, ... ?


Having worked previously at a company that sounds very similar (maybe even the same one?), our approach was primarily individually written scrapers and API integrations (when available) in Python, built on an underlying scraping framework that predated requests. As you might imagine, these integrations required constant maintenance and were bug-prone, so much so that the company eventually resorted to outsourcing the work... There were attempts to reduce the maintenance burden with an in-house DSL/rules engine, but ultimately the range of integrations it could support was very limited, and the project was scrapped.
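
For illustration, a minimal sketch of what that per-source integration pattern tends to look like: each source gets its own adapter that normalizes raw payloads into one shared record shape. All names and fields here are hypothetical, not the company's actual schema or framework.

```python
# Hypothetical per-source adapter pattern: one small function per
# vendor/API, all normalizing into a single shared record type.
from dataclasses import dataclass

@dataclass
class Listing:
    source: str
    title: str
    price_cents: int

def from_vendor_a(raw: dict) -> Listing:
    # Assume vendor A returns JSON with dollar-string prices, e.g. "12.99".
    return Listing("vendor_a", raw["name"], int(round(float(raw["price"]) * 100)))

def from_vendor_b(raw: dict) -> Listing:
    # Assume vendor B already reports integer cents, under different keys.
    return Listing("vendor_b", raw["product_title"], raw["amount_cents"])

# Dispatch table: adding a source means writing (and maintaining) one more adapter.
ADAPTERS = {"vendor_a": from_vendor_a, "vendor_b": from_vendor_b}

def normalize(source: str, raw: dict) -> Listing:
    return ADAPTERS[source](raw)

print(normalize("vendor_a", {"name": "Widget", "price": "12.99"}))
# → Listing(source='vendor_a', title='Widget', price_cents=1299)
```

The maintenance cost the comment describes follows directly from this shape: every upstream format change breaks exactly one adapter, so the adapter count grows linearly with sources and each one rots independently.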



