Two very useful tools to improve your website's content discovery by search and tagging
26th September 2012
We are always looking for ways to make the content on our clients' websites easier for visitors to find. This involves tagging, cross-linking, 'also read' suggestions and, of course, search. In some places we can use ready-made tools; in others we have to build a DIY project from scratch.
Here are two tools that should be of immense value to any webmaster looking to improve their site's content discovery.
This is the master blaster of search engine software. Available in different versions for all sizes of websites, from small (50 pages) to huge (200,000 pages).
Positives:
- It's really fast and thorough, creating a word bank and link bank of all the words and phrases in the site.
- Results are brilliant, like a personal Google for your site.
- It's easy to implement, with 5-6 data files and a couple of search code files.
- Works on multiple platforms, i.e. HTML, PHP, ASP etc.
- It's not too expensive for the power and usefulness.
- View full feature list.
Drawbacks:
- It's an offline indexer, which means the search does not update as content is added online. You have to rerun it every time new content is added so that the site gets reindexed. This is tedious for sites with daily uploads.
- Search output is limited and you do not get access to the raw data. Since it is indexing the site anyway, we would have liked more usage options, such as 'also read' suggestions.
Overall, a powerful tool to quickly and reliably add a top-quality search option to your site, irrespective of its size.
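Since the indexer has to be rerun by hand, it helps to know when a rerun is actually needed. Here is a minimal sketch, not part of the tool itself, that fingerprints a local copy of the site and reports whether anything changed since the last check. The function names, the file patterns and the state file are all our own assumptions for illustration.

```python
import hashlib
from pathlib import Path


def site_fingerprint(root, patterns=("*.html", "*.php")):
    """Return one SHA-256 digest covering every matching file under root."""
    digest = hashlib.sha256()
    files = sorted(
        path for pattern in patterns for path in Path(root).rglob(pattern)
    )
    for path in files:
        # Include the relative path so renames also change the fingerprint.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_reindex(root, state_file):
    """Return True if the content changed since the last check.

    The new fingerprint is written to state_file (a hypothetical path of
    your choosing) whenever a change is detected.
    """
    state = Path(state_file)
    current = site_fingerprint(root)
    previous = state.read_text().strip() if state.exists() else None
    if current == previous:
        return False
    state.write_text(current)
    return True
```

A scheduled job could call `needs_reindex("site", "last_index.txt")` and only prompt for a reindex when it returns `True`. Keep the state file outside the patterns being fingerprinted.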
If you have a bunch of content without context, then AlchemyAPI is for you. In their own words, 'AlchemyAPI is a cloud-based text mining platform providing semantic tagging'. This means that it will scan your site content and come up with a range of contextual results that help you sort the content in a better way, e.g. by extracting relevant tags or categories.
Positives:
- It's a web-based API that can be applied on the fly when the page loads, or pre-run with the results stored in the db.
- A range of results is possible from the API:
  - Extracting persons' names, company names etc. (Entity Extraction)
  - Extracting whether the content is positive or negative (Sentiment Analysis)
  - Extracting similar concept words, e.g. "Hillary Clinton + Michelle Obama + Laura Bush" == "First Ladies of the United States"
  - Extracting keywords for use in tagging and search.
  - and many more which you can explore here.
- It's quite fast so it won't slow down page load times if used live.
- Allows a generous number of free API calls per day, although they do have enterprise packages.
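The "pre-run and store in the db" option from the first point can be sketched roughly as below. This is only an illustration under our own assumptions: `fetch_tags` is a hypothetical function you would write to call the tagging API for one URL, and the `page_tags` table is a made-up name. Nothing here is AlchemyAPI's actual client code.

```python
import json
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a small SQLite cache of tags per URL."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_tags ("
        "url TEXT PRIMARY KEY, tags TEXT NOT NULL)"
    )
    return conn


def tags_for(conn, url, fetch_tags):
    """Return cached tags for url, calling fetch_tags only on a cache miss.

    fetch_tags is a hypothetical callable (url -> iterable of strings)
    that wraps whatever API request you use.
    """
    row = conn.execute(
        "SELECT tags FROM page_tags WHERE url = ?", (url,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])
    tags = list(fetch_tags(url))
    conn.execute(
        "INSERT INTO page_tags (url, tags) VALUES (?, ?)",
        (url, json.dumps(tags)),
    )
    conn.commit()
    return tags
```

With this in place a page load only hits the API the first time a URL is seen, which also helps you stay within the free daily call limit.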
Drawbacks:
- Results can sometimes be mixed, especially in concept extraction.
- Sometimes gets stuck if the HTML formatting of the target page is incorrect or has issues.
- Does not scan entire sites, just individual URLs, so it's up to you to build a contextual database by running through the sitemap.
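Running through the sitemap, as mentioned in the last point, could look something like this. It is a minimal sketch assuming a standard sitemaps.org XML sitemap; `analyze` is a hypothetical function of your own that sends one URL to the API and returns its results. Failed URLs are collected rather than stopping the run, since badly formatted pages can trip the API up.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def build_index(urls, analyze):
    """Run analyze (a hypothetical per-URL API wrapper) over each URL.

    Returns (results by URL, list of URLs that raised an error).
    """
    index, failed = {}, []
    for url in urls:
        try:
            index[url] = analyze(url)
        except Exception:
            # Keep going; the failed pages can be retried or fixed later.
            failed.append(url)
    return index, failed
```

The returned dictionary is the starting point for the contextual database; the failed list tells you which pages need their HTML looked at.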
AlchemyAPI is a work in progress and we are looking forward to seeing many more features from them in the future. But for now their set of APIs is excellent for our content discovery needs.
Cheers,
Ron