Guidelines for Designing Preservation-Friendly Websites

The design of your website largely determines our ability to preserve it successfully. Please keep these guidelines in mind when developing your website to ensure its reliable capture and rendering. They are based on similar guidelines from the Library of Congress, Columbia University, and the Smithsonian Institution. To assess your website’s ability to be archived successfully, visit ArchiveReady. Complete this form to recommend a website for preservation by the University Archives.

  • Provide a standard link to all website content (including pages, images, videos, documents); links should be in HTML/XHTML format, rather than embedded in JavaScript or Adobe Flash.

  • Avoid proprietary formats (when possible) for important content, especially the home page. If you use a proprietary format, make sure it is widely used and well documented (such as DOCX and PDF). Open standards and open file formats are generally the best choices for preservation.

  • Do not create home pages that rely heavily on images or animations such as Adobe Flash. If you do create such pages, also provide alternative text-only HTML versions.

  • Include a user-facing (HTML) sitemap and/or an XML sitemap; sitemaps providing links to all content in a website help to ensure that crawlers will capture an entire site.

  • Omit robots.txt exclusions or limit them to areas not needed for archiving (such as calendar functions and databases). Please note that web archiving requires crawling of stylesheets and images; please be sure that directories containing these files are not restricted.

  • Where feasible, ensure that objects (video, audio, etc.) are hosted within your own site or page rather than embedded from third-party websites; the Heritrix crawler is unable to crawl some content from certain third-party sites (Vimeo, among others).

  • Maintain stable URLs for particular content and redirect from old URLs to new URLs when necessary.

  • State the character encoding; use an HTML meta tag or XML declaration to indicate the encoding that should be used for proper rendering of the webpage.
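
The sitemap and robots.txt guidelines above can be checked with a short script before a site is submitted. The following is a minimal sketch in Python, not a tool used by the University Archives: the function names, the example.edu URLs, and the sample robots.txt rules are all hypothetical. It builds a basic XML sitemap (with an explicit UTF-8 encoding declaration) and reports which stylesheet or image paths a given robots.txt would block for a generic crawler.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return an XML sitemap string listing every URL in `urls`.

    The output begins with an XML declaration that states the encoding.
    """
    ET.register_namespace("", SITEMAP_NS)  # emit tags without a prefix
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    data = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    return data.decode("utf-8")


def blocked_paths(robots_txt, paths, user_agent="*"):
    """Return the subset of `paths` that `robots_txt` disallows.

    `user_agent` defaults to the wildcard; an archive's crawler may
    identify itself differently (assumption: check with the archive).
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    # Hypothetical site content.
    print(build_sitemap([
        "https://example.edu/",
        "https://example.edu/about",
    ]))

    # Hypothetical robots.txt: excluding the calendar is fine for
    # archiving, but excluding /css/ would break rendering of captures.
    sample_rules = "User-agent: *\nDisallow: /calendar/\nDisallow: /css/\n"
    needed = ["/css/site.css", "/images/logo.png"]
    print("Blocked assets:", blocked_paths(sample_rules, needed))
```

Any path reported as blocked that holds stylesheets or images should be removed from the robots.txt exclusions; exclusions for areas such as calendars can stay.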