Browsing Data Science Institute (Workshop Papers) by Title
µRaptor: A DOM-based system with appetite for hCard elements
(CEUR-WS.org, 2014) This paper describes µRaptor, a DOM-based method to extract hCard microformats from HTML pages stripped of microformat markup. µRaptor extracts DOM sub-trees, converts them into rules, and uses them to extract hCard ...