MAF User Manual
MAF saves a web page, or multiple web pages open in several
tabs, to a single "web archive" file. MAF
is an acronym for Mozilla
Archive Format, a free add-on for the free
Firefox and SeaMonkey browsers.
MAF provides other enhancements to the standard browser's save system. For a quick overview, see the features page.
Note: This manual covers the latest version of Mozilla Archive Format. Some features may not be available in the version from the Firefox Add-ons website.
Introduction
MAF
archives are a convenient cross-platform means to preserve web
pages. MAF
archives store all the text, images, and other resources of one
or more web pages in a
single file. When a MAF web archive is moved or renamed, the saved pages are unchanged.
The MAF add-on can also convert pages saved in MAFF format to
other formats, such as ordinary web pages or the MHTML
format used by Microsoft's Internet Explorer browser.
Saving web archives
MAF provides two new options of file type in the Save As dialog box:
- Web Archive, MAFF zipped
-
This option saves one or more pages inside a single Mozilla Archive Format File,
or MAFF archive. MAFF archives are compressed using the universal,
cross-platform ZIP specification for saving multiple files in one
archive.
MAFF archives can be opened in the browser.
If multiple tabs were saved, opening a MAFF archive opens all the tabs,
exactly the way they were when they were saved.
-
It is possible to view the original location from which the page was
saved. The contents of the archive, including any embedded media files, can be inspected and extracted using any
ZIP utility, such as the free 7-Zip.
The Mozilla Archive Format
extension generates MAFF archives using the fast, native ZIP
implementation embedded in the Mozilla browser. The resulting files are
usually smaller than the equivalent MHTML archives, and opening this
kind of file is faster.
-
However, Microsoft's Internet Explorer browser cannot open
MAFF files, so MAF can also save web pages in the Microsoft format,
MHTML.
- Web Archive, MHTML
-
This web archive
format, also known as MHT, is used by Microsoft's Internet Explorer browser. This
option saves a single page inside a MIME HTML file, or
MHTML archive.
MHTML files are
encoded, not compressed. The encoding usually increases the size of the
saved media files compared to the original. At present, the contents
of an MHTML archive can be decoded by only a limited number of web
browsers, or by using special utilities. However, the MHTML archive format has the advantage that it can
be shared with those who use only Internet Explorer.
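Because a MAFF archive is an ordinary ZIP file, its contents can also be inspected with a short script instead of a ZIP utility. The following Python sketch is only an illustration: the function name is hypothetical, and it assumes that each saved page is stored under its own top-level folder inside the archive, which this manual does not state.

```python
import zipfile
from collections import defaultdict


def list_maff_contents(archive_path):
    """Group the files of a MAFF (ZIP) archive by top-level folder.

    Returns a dict mapping each top-level folder name to a list of
    (name inside the folder, original size, compressed size) tuples.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Assumption: one top-level folder per saved page.
            top, _, rest = info.filename.partition("/")
            groups[top].append((rest or top, info.file_size, info.compress_size))
    return dict(groups)
```

For example, `list_maff_contents("saved-tabs.maff")` (a hypothetical file name) returns one entry per top-level folder, which makes it easy to see how many pages an archive holds and how well each file was compressed.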
Additional information saved in web archives
When you save a page as a web archive, the following additional information about the save operation is stored in the archive:
- The original location from which the page was saved. This normally matches what is displayed in the address bar of the browser.
- The date and time the page was saved.
- The title of the page, if present.
- The character set in effect at the time the page was saved.
If the character set was changed manually using the View » Character Encoding
menu item, the custom choice is remembered. This allows the document to
be displayed correctly when it is reopened from the archive, even if it
contains international characters.
If you re-save an already archived page to a different file, the save time and location from the original archive are preserved.
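An MHTML archive is a MIME message, so some of this information can usually be read with a generic MIME parser. The sketch below uses Python's standard email package. It assumes that the title, save time, and original location appear in the Subject, Date, and Content-Location headers, as is typical for MHTML files; this manual does not specify where MAF stores them, so treat the header names and the function name as assumptions.

```python
from email import message_from_bytes


def read_mhtml_summary(raw_bytes):
    """Return a few descriptive headers and the list of parts of an MHTML file."""
    # The default (compat32) policy returns header values as raw strings,
    # so RFC 2047 encoded titles are not decoded here.
    message = message_from_bytes(raw_bytes)
    location = message.get("Content-Location")
    parts = []
    for part in message.walk():
        if part.is_multipart():
            continue
        if location is None:
            # Assumption: the first part that names a location is the main page.
            location = part.get("Content-Location")
        decoded = part.get_payload(decode=True) or b""
        parts.append((part.get_content_type(),
                      part.get("Content-Transfer-Encoding"),
                      len(decoded)))
    return {
        "subject": message.get("Subject"),
        "date": message.get("Date"),
        "location": location,
        "parts": parts,
    }
```

Each entry in `parts` lists the content type, the transfer encoding, and the decoded size in bytes of one resource.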
Opening web archives
After MAF is installed, web archives can be opened using the File » Open File... menu choice or using drag-and-drop, as with any saved web page.
Under Windows, MAF can also create file associations that open MAF web archives by double-clicking the file names in Windows Explorer.
Viewing information about archived pages
By
default, when you are displaying an archived page, an additional icon
appears in the address bar of the browser. You can click the icon to
display the following information about the archived page:
- The original location the page was saved from, if available.
- The date and time of the save operation, if available.
The
original location is a link. You can click it with the left mouse
button to open the original page in the same tab, or you can use the
appropriate key combinations to open the link in a new tab or a new
window.
From the popup panel with the information on the page, you can also access the Archives dialog, which provides additional information on all the archives that have been opened during the current browsing session.
The icon can also be displayed in the status bar. You can control the visibility and position of the icon from the interface preferences.
Integration with other extensions
One
of the key features of Mozilla Archive Format is that it integrates not
only with the core of the browser, but with other extensions as well,
to provide a smooth user experience.
Some of these
extensions, like UnMHT, must be installed separately; other extensions,
like Save Complete, are embedded and updated together with MAF.
- Multiple Tab Handler, by Shimoda Hiroshi
-
This extension adds a multiple selection interface and a new context menu to the Firefox tab bar.
MAF
integrates with the tab selection context menu and adds an entry to
save the selected tabs in an archive. For MHTML archives, MAF creates
multiple files, while for MAFF archives all the tabs are saved in a
single file.
- Save Complete, by Stephen Augenstein
-
The Save Complete extension is integrated with MAF, but must be enabled from the preferences.
This
extension replaces the system used by the browser to save complete web
pages. The new system correctly handles style sheets that reference image
files, which would otherwise not be saved, causing some pages to appear
differently.
- File Title, by Pavel Cvrcek
-
The functionality of the File Title extension is also available from the MAF preferences.
This extension replaces the default file name suggested in the Save As dialog box with the title of the page being saved.
- Title Save, by gm
-
This extension is similar to File Title, but does not affect the default behavior and adds a new item in the File menu to use the title of the page instead of the file name in the Save As dialog box.
You can use the new command to save MAFF and MHTML archives too.
You
may install this extension if you want to selectively use the page
title instead of the original file name. In this case, ensure that the
browser's standard naming strategy is selected in the MAF preferences; otherwise the title of the page might be used in all cases.
- UnMHT, by Arai
-
This extension adds new options in the File menu to save MHTML archives, and also provides other advanced features.
If UnMHT is installed, you can continue to use MAF to create and open MAFF archives, while MHTML archives are opened with UnMHT.
Converting previously saved pages to other file formats
You
probably already have some web pages saved among your local files.
These pages are often stored as file / folder pairs (like Page.html and Page_files), and you may want to convert them to a web archive format for easier maintenance.
You
may also want to convert saved pages from a web archive format to
another, for example from MHTML to MAFF to save disk space or vice
versa to achieve compatibility with Internet Explorer.
Converting single pages
Converting
a single page that was previously saved locally is as easy as opening
the page in the browser and resaving it in another file format. The
Mozilla Archive Format extension handles the details of the conversion
process, and preserves the information about the original source, if
available.
When converting a web page that is not stored in an archive, the following information is preserved:
- The date and time of the original save operation is obtained from the local file's last modification time.
-
The original location
is obtained from the special comment some browsers embed in the page
when they save it. If not available, the local address is used.
If the page was saved with Internet Explorer, the original location is stored like <!-- saved from url=(0023)http://www.example.org/ -->.
If
the page was saved using the standalone or the integrated Save Complete
extension for Firefox, the original location is stored like <!-- Source is http://www.example.org/ -->.
If the page was saved using SeaMonkey or Firefox without the Save Complete extension, the original location is not available.
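The two comment styles described above can be recognized with simple pattern matching. The following Python sketch is an illustration only, not the code used by MAF: the function name and the regular expressions are assumptions, and the fallback argument stands in for the local address.

```python
import re

# Internet Explorer style: <!-- saved from url=(0023)http://www.example.org/ -->
# The number in parentheses is skipped, not validated.
IE_COMMENT = re.compile(r"<!--\s*saved from url=\(\d+\)(\S+?)\s*-->", re.IGNORECASE)

# Save Complete style: <!-- Source is http://www.example.org/ -->
SAVE_COMPLETE_COMMENT = re.compile(r"<!--\s*Source is (\S+?)\s*-->", re.IGNORECASE)


def find_original_location(html_text, fallback=None):
    """Return the original location recorded in a saved page, or the fallback."""
    for pattern in (IE_COMMENT, SAVE_COMPLETE_COMMENT):
        match = pattern.search(html_text)
        if match:
            return match.group(1)
    # No recognized comment: use the local address supplied by the caller.
    return fallback
```

A page saved by a browser that writes neither comment yields the fallback value, which mirrors the behavior described for pages saved with SeaMonkey or Firefox without Save Complete.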
When
converting a web archive to another archive format, all the information
that is supported by the destination file format is preserved.
When
saving an archived page as a complete page outside of an archive, if
the integrated Save Complete extension is enabled, the original source
location is stored in a comment inside the saved page.
Converting multiple pages
If you have many saved pages that you want to convert to another file format, you can use the Saved Pages Conversion Wizard. You can start the wizard using the Tools » Mozilla Archive Format » Convert Saved Pages menu item. If the Mozilla Archive Format submenu is hidden, you must first enable it from the interface preferences.
The
wizard allows you to convert all the pages located in one folder,
optionally including all its subfolders, automating the task of opening
each page and saving it using another file format. When using the
conversion wizard, the following information must be considered:
- The wizard operates on multiple files, but the results for each file are equivalent to converting a single page by opening and saving it manually. The same information about the original location is preserved, and the fidelity of the resulting page depends on the destination file formats and the current preferences.
- For best results, it is recommended that you enable the Save Complete component before starting the conversion.
- If you want to convert from MHTML to another file format, like MAFF, and you have installed the UnMHT extension, you must disable it for the duration of the conversion process.
- The
wizard only operates on one page for each file. If you want to convert
from a multi-page MAFF archive to another file format, you should
extract the archive first, using an ordinary ZIP utility. If you want,
you can then convert the resulting complete web pages to MHTML using
the conversion wizard.
- If you are converting from a
web archive format, ensure you have enough free space in your temporary
folder, since the archives are normally extracted to the temporary
folder before conversion. If you need to convert many pages and don't
have enough free space, you may want to convert only some of them at a
time, and restart the browser between each conversion batch. You can
also move the temporary folder to a different drive in the advanced preferences.
- In some cases, the automatic conversion of complex web pages may fail. These pages may need manual conversion.
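Before converting a large batch of archives, you can estimate whether the temporary folder has enough room. The Python sketch below is a rough aid under stated assumptions: the function name is hypothetical, it checks the system temporary directory unless another folder is given, and it compares free space with the size of the archives on disk, which is only a lower bound because compressed archives grow when extracted.

```python
import os
import shutil
import tempfile


def temp_space_report(archive_paths, temp_dir=None):
    """Compare the total size of some archives with the free space in a folder."""
    # Assumption: the temporary folder is on the same drive as the
    # system temporary directory unless temp_dir says otherwise.
    temp_dir = temp_dir or tempfile.gettempdir()
    archive_bytes = sum(os.path.getsize(path) for path in archive_paths)
    free_bytes = shutil.disk_usage(temp_dir).free
    return {
        "temp_dir": temp_dir,
        "archive_bytes": archive_bytes,
        "free_bytes": free_bytes,
        # Lower bound only: extracted contents may be larger than the archives.
        "enough_for_archives": free_bytes > archive_bytes,
    }
```

If the report shows little headroom, converting in smaller batches or moving the temporary folder, as described above, is the safer choice.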
Selecting which files to convert
First,
you must select the source and destination file formats, and the source
folder to be searched for source files. You can decide to look in
subfolders of the selected folder or to convert only the files that are
placed directly inside the selected folder.
The selected
source format determines how the wizard will look for source files. The
MAFF and MHTML web archive formats are recognized by their extension,
respectively .maff and either .mht or .mhtml. Complete web pages are recognized because they have an associated support folder, for example Page.html and Page_files, but also Page (without extension) and Page_files. Web pages saved as single files, without support folders, are recognized by their extension only.
If
you are using your browser in a language other than English, the
recognition of additional support folder suffixes will be enabled. For
example, if you are using your browser in French, a support folder
named Page_fichiers can be recognized, in addition to Page_files.
If you previously saved pages using a browser in a different language
than the current one, the support folder names may not be recognized
correctly.
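The recognition rules above can be summarized in a few lines of code. The following Python sketch is not the wizard's actual logic: the function name, the returned labels, and the list of single-page extensions are assumptions made for illustration, and only the English and French support folder suffixes mentioned in the text are included.

```python
from pathlib import Path

ARCHIVE_EXTENSIONS = {".maff": "maff", ".mht": "mhtml", ".mhtml": "mhtml"}
# "_files" is the English suffix; "_fichiers" is the French example above.
SUPPORT_SUFFIXES = ("_files", "_fichiers")
# Assumption: extensions accepted for pages saved without a support folder.
SINGLE_PAGE_EXTENSIONS = {".html", ".htm", ".xhtml"}


def classify_source(path, support_suffixes=SUPPORT_SUFFIXES):
    """Return "maff", "mhtml", "complete", "single", or None for a file."""
    path = Path(path)
    if not path.is_file():
        return None
    kind = ARCHIVE_EXTENSIONS.get(path.suffix.lower())
    if kind:
        return kind
    # A complete page has a sibling support folder such as Page_files.
    for suffix in support_suffixes:
        if (path.parent / (path.stem + suffix)).is_dir():
            return "complete"
    if path.suffix.lower() in SINGLE_PAGE_EXTENSIONS:
        return "single"
    return None
```

Passing a different `support_suffixes` tuple is one way to model pages that were saved with a browser in another language.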
The selected destination format determines how
the wizard will assign the output file names. The extension in the
source file name, if present, is always replaced with the correct
extension for the destination file format. For MHTML, the advanced preferences determine whether the .mht or .mhtml extension is used.
The
next step consists in selecting the destination folder. If you want,
you can place the converted files in a different folder from the
original files. This is particularly useful if you are converting from
a read-only source, like a CD-ROM or a DVD. The original folder
structure is always preserved, so that if a source file is located in a
subfolder of the original folder, the converted file will be located in
a subfolder with the same name in the destination folder.
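The way output names are derived, keeping the subfolder and replacing the extension, can be sketched as follows. This Python fragment only illustrates the rule described above; the function and parameter names are hypothetical.

```python
from pathlib import Path


def destination_for(source_file, source_root, dest_root, extension):
    """Map a source file to its converted path in the destination folder.

    The path relative to source_root is kept, and the extension (for
    example ".maff", ".mht" or ".mhtml") replaces any existing one.
    """
    relative = Path(source_file).relative_to(source_root)
    # with_suffix replaces the extension, or adds one if there is none.
    return Path(dest_root) / relative.with_suffix(extension)
```

For instance, with a source folder `/src` and a destination folder `/out`, the file `/src/sub/Page.html` converted to MAFF maps to `/out/sub/Page.maff`.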
You
may also choose to place the converted files near the original files.
Each converted file will be placed in the same folder as its original,
with the same file name, but with a different extension. In this case,
you may want to move the original out of the way, by selecting a folder
that will be used as a bin for the original files that have been
successfully converted.
If you are converting from the MAFF file format and the use of the "jar:" protocol is enabled in the advanced preferences,
you will not be able to move the source files to another folder, since
the browser will lock the files in place until it is closed. If you
want to use this feature when converting from MAFF to another format,
you should disable the use of the "jar:" protocol for the duration of
the conversion process.
The conversion wizard will never
delete or overwrite the source files. Since the converted pages may not
be entirely faithful to the original, you should always keep a backup
of your source files available, even after a successful conversion.
Finally,
the source folder is scanned to locate the original files. Depending on
how many files are present in the source, this operation may require
some time. If you are working with large folder trees, you may want to
repeat the wizard multiple times, converting one subfolder at a time.
Before
the actual conversion begins, you have the option of fine-tuning your
selection, and you can verify that the source files have been
identified correctly. In the list of files, in addition to the source
file name, support folder name and subfolder, you may display other
columns like the full source, destination and bin paths.
If
for any reason the destination file or support folder is already
present, or if a file or support folder is already present in the
folder where the source file would be moved after conversion, the
source file name will appear in the list, but the selection checkbox
will be disabled. This often indicates that the source file was
converted successfully during a previous run of the wizard.
Completing the conversion
After you have selected the files to be converted, click the Finish button to start the conversion process. Depending on the number of files, this process may require some time.
You can cancel the conversion at any time by closing the wizard or by using the Back button. Canceling the operation may require some time.
When
the operation is finished, you can see the count of how many files have
been successfully converted and how many files failed. The icon near
each file name indicates its current status:
not selected,
already converted,
waiting for conversion,
currently converting,
conversion failed, or
conversion succeeded.
Detailed information about the reasons for conversion failures is available in the Error Console, accessible from the Tools » Error Console menu item.
If you are satisfied with the results, click the Finish button to close the window. You may also use the Back
button to retry the conversion process with the same settings, or to
change your selection and repeat the process with different folders.
Preferences
The
default settings in effect after installation are enough to allow
correct loading and saving of both MAFF and MHTML archives. To enable
or disable the integrated Save Complete extension, customize the
interface, or modify advanced aspects of page loading and archiving,
you can change the extension's preferences.
The preferences dialog can be accessed from the Tools » Mozilla Archive Format » Preferences menu item or from the button in the archive information popup. If the Mozilla Archive Format submenu or the icon to display the popup are hidden, you can still open the preferences dialog using the Options button in the Add-ons dialog, available from the Tools » Add-ons menu choice.
Main preferences
- When saving complete web page contents:
-
This
preference controls which method is used to find all the web resources
(images, subpages, ...) that are included in the web page being saved.
This step is preliminary to archiving all the resources in MAFF or
MHTML format.
You may change this preference if the
saved pages seem to be really different from their original version, to
achieve a better result.
- Use browser's standard save system.
(default) With this setting, the web pages are saved by the browser.
How much of the web page is actually saved depends on the version of
the browser being used.
- Preserve scripts and source using “Save Complete”. Allows more content to be saved, thanks to the integrated Save
Complete extension written by Stephen Augenstein. This save mode
attempts to preserve the dynamic features of the page by keeping all
the scripts and the original page source code, but content generated by
scripts may be missing from the resulting page. Note that if you also
have a standalone version of Save Complete installed, MAF will continue
to use the integrated one.
Improvements to Save Complete are
periodically included in new versions of MAF.
- Take a faithful snapshot of the page. This is the most accurate save mode, as it captures the current state
of the page and creates an exact replica, including the current values
of form fields, as well as video and audio embedded in the page. This
save mode works especially well for pages that make extensive use of
scripts and use dynamic technologies like AJAX.
The resulting page will
be static, as scripts are disabled by the save operation to preserve
the integrity of the result when it is displayed again.
Note that the selected component will be used not only when saving archives, but also when saving complete pages using the
File » Save Page As... » Save as Type: Web Page, complete menu choice.
- Create MHTML files fully compatible with other browsers
-
When
this preference is enabled, Mozilla Archive Format will create MHTML
files according to the original specification, allowing any browser to
open the archives correctly, even in case of very complex pages. If
this preference is disabled, MAF will generate a specific MHTML
variant, that will open much more quickly in Firefox or SeaMonkey, even
for very large documents, but that other browsers would not be able to
display with proper formatting if the saved page contains nested CSS
stylesheets or inner frames.
MAF is only able to
create compatible MHTML files using the integrated Save Complete
component. If the Save Complete component is disabled, only the
MAF-specific MHTML variant is available.
- For the suggested file name:
-
This preference controls which method is used to select the default file name in the Save As dialog box.
- Use browser's standard naming strategy
(default) - With this setting, the Mozilla Archive Format extension
does not alter the current behavior, which is determined by the browser
or by other installed extensions.
If no other extension affecting this behavior is installed, the
original name of the file will be preferred to the title of the page.
- Use the title of the page whenever possible
- With this setting, the title of the page will be preferred to the
original file name. This is done for all HTML and XHTML pages, unless
the server you are downloading the page from explicitly asked the
browser to use a specific file name. Note that if other extensions affecting this behavior are installed, this setting may not work as expected.
- Save extended metadata in MAFF archives
-
With
this preference enabled, additional page information such as history,
text zoom and scroll position is saved for each page. There is
currently no preference to restore this saved information.
Interface preferences
- Show Mozilla Archive Format icon in:
-
You
can control the visibility and position of the icon that provides
access to the additional information about an archived page.
- Address bar
- Always display the icon in the address bar. If the current page is
not saved in an archive, the icon is grayed out, but you can still use
it to access the Archives dialog or the preferences.
- Address bar, for archived pages only (default) - Keep the icon hidden during normal browsing, and display it only when viewing a page that is saved in an archive.
- Status bar
- Display the icon in the status bar. If the current page is not stored
in an archive, the icon is grayed out, but you can still use it to
access the Archives dialog or the preferences. This option is
recommended if you are using a theme that is not compatible with
additional icons in the address bar.
- None - Do not display the icon. Note that if you hide the icon, you can still access the MAF preferences from the Tools menu or the Add-ons dialog.
- Show Mozilla Archive Format menu items in:
-
You can select which menus will display the Mozilla Archive Format items. If you disable the Tab Bar Context Menu option but enable the Page Context Menu option, the tab-related menu items will appear in the page context menu instead of the tab bar.
- Show these additional menu items:
-
The Save In Archive option enables displaying the additional Save Page In Archive and Save Frame In Archive menu items near the Save Page As and Save Frame As standard items, across all menus. These items open a special Save As dialog reserved for saving in archives, and are useful if you routinely use the standard Save As
dialog to save only the text of a page, and need a separate option to
save a page in an archive without changing the selection in the file
type drop down list.
The Save Page In Archive and Save Frame In Archive menu items are always available under the Tools » Mozilla Archive Format menu, if visible, regardless of this preference.
File associations
- File associations
-
This preferences pane allows you to create or refresh file associations on Windows, for the MAFF and MHTML formats separately.
File
associations are always created explicitly for the current user of the
system. In addition, if the current user has administration privileges,
default file associations for all users are also created.
File associations are not removed when uninstalling MAF or the browser itself.
Advanced preferences
- Temporary folder
-
This preference allows you to customize the location of the temporary files required to open and save the web archives.
If unspecified, this location defaults to the maftemp subdirectory of the system temporary directory.
If
customized, the absolute path to the specified location is remembered.
The contents of the selected folder will be lost if the Clear temporary folder when browser exits
option is selected. There is usually no need to customize the temporary
folder unless you use different browser profiles on the same computer
at the same time.
- Clear temporary folder when browser exits
-
This
option is enabled by default. If disabled, the contents of the
temporary directory are preserved after the browser exits, and must be
cleaned up manually.
- Rewrite absolute URLs in open archives
-
With
this preference enabled, archived web pages are processed as
they finish loading in tabs. The processing replaces absolute
links to resources in the page with local resources in the archive (if
possible). This allows users to browse linked pages in an archive
seamlessly.
- Use the "jar:" protocol to access the contents of MAFF archives
-
If
this preference is enabled, when you open a MAFF archive its contents
will be accessed directly using the "jar:" protocol, without being
extracted.
However, if you enable this option, the
archive files you open will be locked, and you will be unable to move,
rename or delete them until the browser is closed.
- Save using the .mhtml file extension instead of .mht by default
-
If this option is selected, and you do not type a file extension in the Save As dialog box or file extensions are hidden, the complete .mhtml extension will be appended to the file name of MHTML archives, instead of the more common .mht extension.
- Display welcome window at next startup
-
The
welcome dialog is usually displayed only when MAF is installed for the
first time. All the options that are set by the welcome dialog are also
available from the preferences, thus it is usually not necessary to
display the dialog again. This option is only provided as a convenience
for translators in need of proofreading the text in the welcome dialog
multiple times.
Internal configuration settings
These configuration settings are not available from the preferences dialog, but only from the about:config
page. Usually, they should not be changed unless there is a specific
reason, and non-default settings may adversely impact functionality or
performance.
- extensions.maf.open.maff.ignorecharacterset
-
When
this setting is enabled, the character set specified for pages saved
inside MAFF archives is ignored, instead of being enforced when the
page is displayed. Enabling this option may be useful to troubleshoot
internationalization issues, but will cause saved pages to be displayed
incorrectly in most cases.
- extensions.maf.save.maff.compression
-
Controls the compression level to use when saving files in a MAFF archive.
- dynamic (default) - Use maximum compression for all files, but do not re-compress media files.
- best - Use maximum compression for all files.
- none - Store all the files uncompressed.
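The difference between the three values can be illustrated with Python's standard zipfile module. This is a sketch, not MAF's implementation: the function name and the list of media extensions are assumptions, and "do not re-compress media files" is interpreted here as storing those files uncompressed.

```python
import zipfile
from pathlib import PurePosixPath

# Assumption: file types treated as already-compressed media.
MEDIA_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".mp3", ".mp4", ".ogg", ".webm"}


def write_archive(zip_path, files, mode="dynamic"):
    """Write files (a dict of archive name -> bytes) using one of three modes."""
    if mode not in ("dynamic", "best", "none"):
        raise ValueError("mode must be 'dynamic', 'best' or 'none'")
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name, data in files.items():
            is_media = PurePosixPath(name).suffix.lower() in MEDIA_EXTENSIONS
            if mode == "none" or (mode == "dynamic" and is_media):
                # Store the file as-is, without compression.
                archive.writestr(name, data, compress_type=zipfile.ZIP_STORED)
            else:
                # Maximum deflate compression.
                archive.writestr(name, data,
                                 compress_type=zipfile.ZIP_DEFLATED,
                                 compresslevel=9)
```

Skipping compression for media files usually costs little space, because formats such as JPEG or PNG are already compressed, and it saves time when the archive is written and read.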
More documentation
This
document provides the essential user documentation for the extension.
Technical documentation about the internals of Mozilla Archive Format
and the MAFF file format is available in separate documents, the API documentation and the MAFF specification.