
In the past, I have sometimes used Craigslist's RSS feature to watch for the appearance of items. That is to say, the ability of CL to export a keyword search as an RSS feed that you can save in your feed reader and watch for updates.

In what ways did Craigwatch improve over RSS? If it still operated today, why would I use that instead of scraping RSS items directly from Craigslist to my feed reader?

Did Craigwatch use RSS feeds, or was it scraping material from the rendered HTML?



Useful free stuff is typically gone in minutes. Only junk remains posted.

Its RSS feeds have a refresh period of 1 hour, which is simply too long to have the opportunity to grab hot items.

As Craigslist has very shallow category listings in its "free" category, where you can't separate furniture from clothes as an example, the site forces users to rely on manually refreshing pages, which change very often.

I have a better user experience using Craigslist from my iPad. Craigslist appears to selectively make the service available to app makers but not to websites.

In the past 3 days of shopping for a specific car within about 15 Craigslist local communities, I managed to get automatically blocked by the service several times. What I am looking for is very specific and I don't mind traveling a bit to get what I want.

I ended up having to write code with sleep timers to reduce my number of web queries once I narrowed what I wanted.
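For anyone curious what "sleep timers" might look like, here is a minimal sketch of a throttle that enforces a minimum gap between requests. The `Throttle` class, its parameters, and the 30-second interval in the usage note are assumptions for illustration, not the code the poster actually wrote, and no particular interval is known to avoid Craigslist's blocking.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds required between requests
        self._clock = clock               # injectable so the class can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        """Sleep until min_interval has elapsed since the last call.

        Returns the number of seconds slept.
        """
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

Usage would be something like `throttle = Throttle(30)` and then calling `throttle.wait()` immediately before every fetch, so the queries are spaced out no matter how fast the rest of the script runs.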

An essential feature in the iPad app is that I can cross off listings that are no longer interesting to me and highlight favorites. There is no listing cross-off feature for the website.

I guess my next step is to add that feature. laugh


Wiseleo, is that really true about the refresh rate? RSS readers have a configurable rate. Often there is a one hour default in order to be reasonably nice to the server. Basically, a Craigslist RSS item is just a URL with the search parameters embedded. Maybe I'm wrong, but I suspect that whenever you fetch this URL, the server executes the search and produces the results as RSS XML items, so the fetch rate is controlled by you (your reader). Or are you saying there is some additional throttling on the server side, so that RSS-based searches do not see up-to-the-second updates that are visible through the Web interface? So that no matter how often you refresh the RSS feed, you don't see new items that are already visible via HTML?


Gotta love being downvoted for an informative post... :)

If you look at the XML returned for RSS feeds, you will notice:

<syn:updateBase>2014-07-16T14:25:18-07:00</syn:updateBase>
<syn:updateFrequency>1</syn:updateFrequency>
<syn:updatePeriod>hourly</syn:updatePeriod>

That is the configuration setting for RSS readers to not update more than once per hour.

See spec: http://web.resource.org/rss/1.0/modules/syndication/
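As a rough illustration, a reader could turn those syndication elements into a polling interval like this. This is a sketch, assuming the standard syndication module namespace and the spec's defaults (`daily`, frequency 1); the `suggested_interval` function and the sample feed are made up for the example, and the values are a hint to readers rather than proof of server-side throttling.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS 1.0 syndication module (the "syn:" prefix above).
SYN = "http://purl.org/rss/1.0/modules/syndication/"

# Approximate length of each allowed updatePeriod value, in seconds.
PERIOD_SECONDS = {
    "hourly": 3600,
    "daily": 86400,
    "weekly": 7 * 86400,
    "monthly": 30 * 86400,
    "yearly": 365 * 86400,
}

# Hypothetical minimal feed carrying the three elements quoted above.
SAMPLE_FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:syn="http://purl.org/rss/1.0/modules/syndication/">
  <channel rdf:about="http://example.org/feed">
    <syn:updateBase>2014-07-16T14:25:18-07:00</syn:updateBase>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updatePeriod>hourly</syn:updatePeriod>
  </channel>
</rdf:RDF>"""


def suggested_interval(feed_xml):
    """Return the feed's suggested seconds between refreshes."""
    root = ET.fromstring(feed_xml)
    period_el = root.find(f".//{{{SYN}}}updatePeriod")
    freq_el = root.find(f".//{{{SYN}}}updateFrequency")
    # Fall back to the spec defaults when an element is missing or empty.
    period = "daily"
    if period_el is not None and period_el.text:
        period = period_el.text.strip()
    freq = 1
    if freq_el is not None and freq_el.text:
        freq = int(freq_el.text.strip())
    # updateFrequency is the number of updates per period.
    return PERIOD_SECONDS[period] / freq
```

With the sample feed this works out to 3600 seconds, i.e. the one-hour period mentioned above.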

Craigslist has aggressive blocking for excessive GET requests. The RSS feed contains only the first 250 characters of text description of the ad. Thus, you will see that something got added but you will not see its details. More importantly, attributes are not available as part of the RSS feed.

That means that if you were interested in specific colors of a car, you would need to define a separate RSS feed for red, yellow, and so on.

It is hard to test whether the RSS search results are additionally throttled, but you will likely get blocked while testing. :)

While legitimately shopping, I got blocked multiple times for becoming more efficient.

RSS is not hard to read. Here are some red manual transmission cars in San Francisco Bay Area http://sfbay.craigslist.org/search/cto?auto_paint=7&auto_tra...


I used to give away good free stuff, but it's just not worth the hassle (going to the wrong people, people not showing up, the spam, over-aggressive flaggers--you people need to get a life, etc.)

While I'm here: what's with the begging to lower an already low price on a service or item? I used to price an item realistically, then knock off 40 percent. People still wanted to haggle over the price. I now just double the price I think it's worth.

There's a huge need for another Craigslist. The site has gotten too Ghetto.


Yeah, I am going to work on it once I am done launching my main product. My main product is for in-person customer acquisition, so I should be able to seed this properly.

I am thinking:

* Schedule pickup (I made a scheduling product a few years ago with a nice algorithm)

* [Innovative revenue source that is the secret sauce, not involving ads or subscriptions]

* Request hold on the item (with ability to flag no shows)

* Hyper-detailed categories (I have a very interesting plan for that)

Open API, but anyone including us is free to clone what you build :)


all that would remove the one killer feature of CL... no logins.

what you describe is just an eBay clone with pickup instead of delivery.


Looks like it was using XPath to parse the HTML: https://github.com/beaulm/craigwatch/blob/master/post.php
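For readers who haven't seen the technique, here is a minimal sketch of pulling listing links out of a page with an XPath expression. The markup, the `row` class name, and the `extract_listings` function are all hypothetical; this is not taken from the linked post.php or from Craigslist's real HTML. It uses the limited XPath subset in Python's standard library, which only handles well-formed markup, so real pages would likely need an HTML-tolerant parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a listings page.
SAMPLE_PAGE = """<html><body>
<p class="row"><a href="/zip/1001.html">Free couch</a></p>
<p class="row"><a href="/zip/1002.html">Moving boxes</a></p>
<p class="other"><a href="/about">About</a></p>
</body></html>"""


def extract_listings(page):
    """Return (href, title) pairs for links inside <p class="row"> elements."""
    root = ET.fromstring(page)
    # XPath: any <p> with class="row", then its direct <a> children.
    links = root.findall(".//p[@class='row']/a")
    return [(a.get("href"), (a.text or "").strip()) for a in links]
```

Calling `extract_listings(SAMPLE_PAGE)` picks up the two `row` links and skips the unrelated one.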

I imagine the 'selling' point was users not even having to know what RSS even means, let alone how to script it.


Thanks for the link. Time to try out Google app engine's PHP preview.


I do that too, but it can be pretty late. At least hours after the item is posted. Which isn't bad for some things, but sometimes you _really_ want to know. :D

I've never actually heard of Craigwatch, but I think it's something most engineers have done at least a couple of times. I've got a script I use to watch the FredMiranda forums any time I need some new camera gear, for example.
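The core of a watch script like that is usually just remembering what you have already seen. A minimal sketch, assuming each item is a dict with a `link` key (the `new_items` function and the item shape are illustrative, not the actual script):

```python
def new_items(seen, items):
    """Return items whose link is not yet in `seen`, recording them as seen."""
    fresh = []
    for item in items:
        link = item["link"]
        if link not in seen:
            seen.add(link)      # remember it so it is reported only once
            fresh.append(item)
    return fresh
```

Each polling pass would call `new_items(seen, fetched)` and notify only on what comes back; persisting `seen` between runs is left out here.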



