As a recently unemployed person, I have needed to find ways to search for the largest number of jobs, while not having to sift through too much cruft. When looking at various job search sites, it can be difficult to use the same search queries on each.
Yahoo Pipes is a mature web aggregation and editing suite that allows you to gather information from all over the web, modify it, and then output it again. We will be gathering job listings from job search sites via RSS, filtering the feeds, then outputting the collection in a single RSS feed.
After creating a new pipe (you will need a Yahoo account), you will have your pipe fetch job listings from two sites, Craigslist and Indeed.com. I have created an example pipe so that you can follow along through the process.
Craigslist’s RSS setup allows you to grab the RSS feeds from any of the job subcategories, or even from the main job category for your area. I chose to use the main job category. The address for the Kansas City job listings is kansascity.craigslist.org/jjj/. To get the RSS feed address, just add the text “index.rss” to the end. You now have kansascity.craigslist.org/jjj/index.rss to add to the pipe.
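If you want to build these addresses for several cities or categories, the rule above is easy to script. This is only a minimal sketch in Python: the function name craigslist_rss_url is made up for illustration, and it simply appends “index.rss” as described above.

```python
def craigslist_rss_url(category_url: str) -> str:
    """Return the RSS address for a Craigslist category page address."""
    # Make sure there is exactly one slash before "index.rss".
    return category_url.rstrip("/") + "/index.rss"


print(craigslist_rss_url("kansascity.craigslist.org/jjj/"))
# prints: kansascity.craigslist.org/jjj/index.rss
```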
The Yahoo Pipes interface is modular: you drag individual modules into the editing area as you need them, and you can use each module as many times as you like. To add the RSS feed, drag the “fetch site feed” module from the left column into the editing area.
Once the module has been added, paste the RSS feed address into the first text box. If at any point you want to see what is coming out of any individual module, you can click on the module and the debug box at the bottom of the page will update with that module’s output.
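To get a feel for what the debug box is listing, it can help to see a feed as a plain list of items. The sketch below is not how Pipes works internally. It assumes a simple RSS 2.0 feed with no namespaces, and the sample feed, its example.com links, and the parse_items name are all invented for illustration. A real feed may use a different RSS flavor and need extra handling.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; real feeds will have more fields and more items.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample job listings</title>
    <item>
      <title>Web Developer</title>
      <link>http://example.com/jobs/1</link>
      <description>Build and maintain websites.</description>
      <pubDate>Mon, 06 Sep 2010 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Graphic Designer</title>
      <link>http://example.com/jobs/2</link>
      <description>Design print and web layouts.</description>
      <pubDate>Tue, 07 Sep 2010 09:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Turn RSS 2.0 text into a list of dicts, one per <item>."""
    root = ET.fromstring(rss_text)
    fields = ("title", "link", "description", "pubDate")
    return [
        {field: node.findtext(field, default="") for field in fields}
        for node in root.iter("item")
    ]


for item in parse_items(SAMPLE_RSS):
    print(item["title"], "->", item["link"])
```

Each dict here plays the role of one entry in the debug box: a title, a link, a description, and a publication date.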
Next, head over to Indeed.com to get your second RSS feed.
First, search for jobs that fit your needs. On the results page, a link to the RSS feed for jobs matching your search will appear in the far right column. Copy the linked address, and paste it into a second “fetch site feed” module in the pipe.
Now that our pipe is fetching possible job listings, we need to filter the feeds to pare down the options to just the ones that interest us. To filter the craigslist feed, add the “Filter” module under the “Operators” section.
To connect modules, click and drag from the output dot on the bottom of the “Fetch Feed” module and release on the input dot on the top of the “Filter” module.
Notice that I “piped” only the Craigslist feed into the new filter module, not the Indeed feed. This filter permits only RSS entries containing keywords that match the jobs I want. The Craigslist feed needs this because it contains every job listing, while the Indeed feed already contains only what we searched for and doesn’t need to be filtered again.
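For readers who think better in code, a “permit” filter can be sketched roughly like this. It is a simplification, not the Filter module itself: it assumes each listing is a dict, checks only the title, and treats a rule as a case-insensitive “contains” match. The permit function and the sample titles are hypothetical.

```python
def permit(items, keywords, field="title"):
    """Keep only items whose field contains at least one keyword."""
    wanted = [keyword.lower() for keyword in keywords]
    return [
        item for item in items
        if any(keyword in item.get(field, "").lower() for keyword in wanted)
    ]


# Hypothetical listings standing in for the unfiltered Craigslist feed.
craigslist_items = [
    {"title": "Web Developer needed"},
    {"title": "Line Cook"},
    {"title": "Junior PHP programmer"},
]

kept = permit(craigslist_items, ["developer", "programmer"])
print([item["title"] for item in kept])
# prints: ['Web Developer needed', 'Junior PHP programmer']
```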
Once the Craigslist feed is properly filtered, you need to combine the two feeds so that you can work with all of the job listings at the same time. This is accomplished with the “union” module as shown.
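Conceptually, combining feeds just means putting all of the items into one list. A minimal sketch, assuming each feed is already a list of dicts (the union function and the sample items here are made up for illustration):

```python
from itertools import chain


def union(*feeds):
    """Combine any number of item lists into one list, keeping their order."""
    return list(chain.from_iterable(feeds))


# Hypothetical items standing in for the two feeds.
craigslist_items = [{"title": "Web Developer needed"}]
indeed_items = [{"title": "Front End Developer"}, {"title": "Python Developer"}]

all_items = union(craigslist_items, indeed_items)
print(len(all_items))
# prints: 3
```

Note that nothing is removed at this step, so a job posted on both sites will appear twice until duplicates are stripped out later.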
Next, you need to remove any unwanted job listings with another filter.
Be careful with this filter. Add only keywords that appear ONLY in job listings you know you don’t want; otherwise you could cut out good job opportunities. Notice that this filter is set to “block” items and not “permit”.
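The risk is easier to see with a small example. The sketch below assumes the same simplified model as before (listings as dicts, case-insensitive match on the title only), and the block function and sample titles are hypothetical. A broad keyword such as “sales” also removes a job you might have wanted.

```python
def block(items, keywords, field="title"):
    """Drop every item whose field contains any of the keywords."""
    unwanted = [keyword.lower() for keyword in keywords]
    return [
        item for item in items
        if not any(keyword in item.get(field, "").lower() for keyword in unwanted)
    ]


# Hypothetical combined listings.
items = [
    {"title": "Web Developer"},
    {"title": "Door-to-door sales"},
    {"title": "Developer for online sales platform"},
]

# Too broad: the developer job that mentions "sales" is lost as well.
print([item["title"] for item in block(items, ["sales"])])
# prints: ['Web Developer']

# Narrower keyword: only the listing we really did not want is removed.
print([item["title"] for item in block(items, ["door-to-door"])])
# prints: ['Web Developer', 'Developer for online sales platform']
```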
Next, add three “unique” modules, chained one after another, to strip out repeat items: one based on title, one on hyperlink, and one on job description.
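Running three passes in a row means an item is dropped if it repeats an earlier title, an earlier link, or an earlier description. Here is a rough sketch of that idea. It assumes the first occurrence is the one kept, which may not match the module's behavior exactly, and the unique function and sample listings are invented for illustration.

```python
def unique(items, field):
    """Keep the first item seen for each distinct value of field."""
    seen = set()
    result = []
    for item in items:
        key = item.get(field, "")
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


# Hypothetical listings with repeats in different fields.
items = [
    {"title": "Web Developer", "link": "http://example.com/a", "description": "Build sites."},
    {"title": "Web Developer", "link": "http://example.com/b", "description": "Maintain sites."},
    {"title": "Designer", "link": "http://example.com/a", "description": "Design layouts."},
    {"title": "Programmer", "link": "http://example.com/c", "description": "Build sites."},
    {"title": "Copywriter", "link": "http://example.com/d", "description": "Write copy."},
]

# One pass per field, like three modules wired one after another.
for field in ("title", "link", "description"):
    items = unique(items, field)

print([item["title"] for item in items])
# prints: ['Web Developer', 'Copywriter']
```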
Lastly, to sort the output so that the newest listings come first, you need to add the “Sort” module and set it to sort on item.pubDate in descending order.
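Sorting on pubDate in descending order puts the most recent listing at the top. A minimal sketch of the same idea, assuming every pubDate uses the usual RSS date format (for example “Mon, 06 Sep 2010 10:00:00 GMT”); the function name and sample items are hypothetical:

```python
from email.utils import parsedate_to_datetime


def sort_newest_first(items):
    """Sort items by pubDate in descending order (newest first)."""
    return sorted(
        items,
        key=lambda item: parsedate_to_datetime(item["pubDate"]),
        reverse=True,
    )


# Hypothetical listings in arbitrary order.
items = [
    {"title": "Older job", "pubDate": "Mon, 06 Sep 2010 10:00:00 GMT"},
    {"title": "Newest job", "pubDate": "Wed, 08 Sep 2010 08:15:00 GMT"},
    {"title": "Middle job", "pubDate": "Tue, 07 Sep 2010 09:30:00 GMT"},
]

print([item["title"] for item in sort_newest_first(items)])
# prints: ['Newest job', 'Middle job', 'Older job']
```

Parsing the dates matters here: sorting the raw pubDate strings would order them alphabetically by weekday name rather than by time.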
The pipe is now designed; you only need to save and run it before adding it to your favorite RSS reader. My favorite is Google Reader, though there are many options. You’ll find the save button at the top of the page. Wait for the pipe to be saved, then click the link at the top of the page to run the pipe.
The webpage will open a new tab or window with the pipe’s dashboard. This is where you can see the results of your pipe, and find the output RSS feed. The RSS feed link is located just above the output list. Copy the RSS link into your favorite RSS reader and you’re done.
You have now condensed multiple job search sites into one convenient location!