
Windows SharePoint Services v3 has this slick built-in feature (no, not a Feature, a feature... ok... scratch that, let's call it a capability :P) where it extracts the properties of a document and stuffs them into the fields of a document library, and also takes changes made to those document library fields and injects them back into the document as properties. Out of the box, WSS v3 has document parsers for the following types of files:

  • Office 2007 XML file types: DOCX, DOCM, XLSX, XLSM, PPTX, and PPTM
  • HTML file types: HTM, HTML, MHT, MHTM, and ASPX
  • DOC, XLS, PPT, MSG, PUB, and XML
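
To make that two-way sync a bit more concrete, here's a rough sketch of the idea in Python. This is not the WSS parser code, just an illustration, and it makes a few assumptions: it only deals with the Office 2007 XML file types (which are ZIP packages that keep their core properties in `docProps/core.xml`), it only looks at three Dublin Core properties, and the function names `promote` and `demote` are made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace for the Dublin Core elements stored in the core-properties part.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}
CORE_PART = "docProps/core.xml"
PROPERTY_NAMES = ("title", "creator", "subject")  # illustrative subset


def promote(package_bytes):
    """Document -> library fields: read properties out of the file."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        root = ET.fromstring(zf.read(CORE_PART))
    fields = {}
    for name in PROPERTY_NAMES:
        node = root.find("dc:" + name, NS)
        if node is not None:
            fields[name] = node.text or ""
    return fields


def demote(package_bytes, fields):
    """Library fields -> document: write changed values back into the file."""
    src = zipfile.ZipFile(io.BytesIO(package_bytes))
    out = io.BytesIO()
    with src, zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE_PART:
                root = ET.fromstring(data)
                for name, value in fields.items():
                    node = root.find("dc:" + name, NS)
                    if node is None:
                        # Property not present yet, so create the element.
                        node = ET.SubElement(root, "{%s}%s" % (NS["dc"], name))
                    node.text = value
                data = ET.tostring(root, encoding="utf-8")
            # Every other part of the package is copied through unchanged.
            dst.writestr(item, data)
    return out.getvalue()
```

Calling `promote` on the bytes of a DOCX gives you a dictionary you could map onto library columns, and `demote` returns a new copy of the file with the edited values written back in. A real parser has a lot more to worry about (which fields map to which properties, content types, and so on).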

But what if you want to create your own document parser for your own file types? No sweat: just like so many other things in this next SharePoint release (WSS & MOSS), it's quite extensible.

Andrew May recently added two fantastic multi-part series to his blog that go into great depth on document parsers. The content is quite technical... no surprise... it's going into the WSS v3 SDK! If you don't want to wait for Beta 2 Tech Refresh, go check them out (and don't forget to get another one of his posters... this time on document parsers)!

» XML Document Property Parsing in SharePoint - Part 1: XML Parser Overview
» XML Document Property Parsing in SharePoint - Part 2: Using Content Types to Specify XML Document Properties
» XML Document Property Parsing in SharePoint - Part 3: Determining Document Content Type for XML Parsing
» XML Document Property Parsing in SharePoint - Part 4: Specifying Document Content Type for XML Parsing
» XML Document Property Parsing in SharePoint - Part 5: Specifying XPath Namespace Prefixes in Content Types
» Document Parsers in SharePoint - Part 1: Overview
» Document Parsers in SharePoint - Part 2: How Parsers Process Documents
» Document Parsers in SharePoint - Part 3: Parsers and Content Types
» Document Parsers in SharePoint - Part 4: Parser Schema and Interface
» Document Parsers in SharePoint Poster
