Creating a Web Page: a Basic Guide

Organizing a Set of HTML Files

This is one of a series of pages that present the basic principles of using HTML to create a user-friendly web site. This page is concerned with files: how to name them, how to get them onto your server, and how to organize them into directories.

Contents

* Principles of friendly web pages

* The structure of an HTML document

* Specifying the layout

* Specifying the typography

* Linking with other locations

* Images

* Files

    * File names

    * Transferring files to the server

    * Directory structure in the server

* Style sheets

* Refinements: using additional features

* Further refinements: improving accessibility

* Trouble-shooting: why doesn’t it come out as expected?

* Detailed list of contents

Organization of web files

A web page usually consists of more than one file, the main exceptions being very short pages that contain no images at all. For a typical page the HTML file that is initially requested by the user’s browser is just a starting point: each image included in the page is a separate GIF, PNG or JPEG file, and in addition most pages belong to groups of pages that link to one another, each page being a separate HTML file.

Although it is technically possible to organize a collection of information into a single HTML file with internal links only, no matter how long it is, this is not a good idea, because the longer an HTML file is the greater the chance that the user will lose interest before it is loaded. Suppose, for the sake of illustration, that the information can logically be organized into ten sections, each requiring about 15 kb of HTML code. Then if it is all contained in a single HTML file this will have a length of around 150 kb. Consider now a user interested in a piece of information contained in the last section, i.e. around 140 kb into the file. Even if the page begins with a contents list providing a link to the required information, it is still necessary to wait for 140 kb of unneeded code to be transferred from the remote server, loaded into the browser and processed by it before the required link can be followed. This puts an unnecessary demand on the server and on the whole web, and it greatly adds to the time the user must wait. It is far better to break the 150 kb into ten smaller files – not necessarily of exactly 15 kb each, but organized according to the logical coherence of what they contain.
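The arithmetic of this illustration can be sketched in a few lines of Python. The section size and the number of sections are taken from the text; the function names and the line speed of 56 kbit/s are assumptions made purely for illustration, and the timing ignores latency and protocol overhead.

```python
# Sketch of the ten-section illustration. The section size and count come
# from the text; the line speed is a hypothetical figure, not a measurement.

SECTION_KB = 15      # approximate size of each section
SECTION_COUNT = 10   # number of sections


def kb_before_section(section_number, section_kb=SECTION_KB):
    """Kilobytes that precede the given section (1-based) in a single file."""
    return (section_number - 1) * section_kb


def seconds_to_transfer(kilobytes, kbit_per_second=56):
    """Rough transfer time in seconds at the assumed line speed."""
    return kilobytes * 8 / kbit_per_second


if __name__ == "__main__":
    unneeded = kb_before_section(SECTION_COUNT)
    print("Single file:", unneeded, "kb to load before the last section")
    print("Separate files: no other section has to be loaded first")
    print("Extra wait at the assumed speed:",
          round(seconds_to_transfer(unneeded)), "seconds")
```

For the last of ten sections the function gives 135 kb, in line with the figure of around 140 kb used above. With separate files that amount is never transferred unless the user asks for it.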

File names

There are at least four points to be borne in mind when giving names to HTML and image files:

  1. Choose meaningful names;
  2. Restrict lengths according to the operating system;
  3. Choose name extensions according to the operating system;
  4. Respect the capitalization conventions of the operating system.

Choose meaningful names. The HTML file for the page you are reading now is called files.htm, but so far as your browser and the whole system for transferring files across the web are concerned it would fulfil its function just as well if it were called ZRX321.HTM, in the style of the names often used in software libraries on main-frame computers. Nonetheless, files.htm is a better name: when the time comes to revise your pages, it will be a great deal easier to identify which file is which if the file that is about files is called files.htm.

Restrict lengths according to the operating system. Regardless of what computer you use to prepare an HTML file in the first instance, it will eventually be stored on a server that will run under an operating system that may or may not be the same as that used on the first computer. Different operating systems have different naming conventions: some restrict names to eight characters before the extension, others do not require an extension and allow names of 31 or 127 characters. You need to know what restrictions apply on your computer before choosing names. As you cannot be certain that your files will always remain on the same server there is something to be said for choosing the most restrictive convention. Thus eight-character names will be acceptable on any system, and you will not need to change them if you later move them to a different server. (Remember that it is not the renaming of files as such that will create problems; it is tracking down all the links to them in other files and changing them that will require time and effort.)
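If you adopt the eight-character convention, a check of the following kind can be run over a list of names before they are used in links. It is a minimal sketch: the function names are hypothetical, and the limit of eight is the most restrictive convention mentioned above, passed as a parameter so that it can be changed.

```python
import os


def stem_length_ok(file_name, max_stem=8):
    """Return True if the part of the name before the extension fits the limit."""
    stem, _extension = os.path.splitext(file_name)
    return 0 < len(stem) <= max_stem


def names_too_long(file_names, max_stem=8):
    """Return the names whose part before the extension exceeds the limit."""
    return [name for name in file_names if not stem_length_ok(name, max_stem)]
```

A call such as names_too_long(["files.htm", "stylesheets.htm"]) reports only the second name, which has eleven characters before the extension.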

Choose name extensions according to the operating system. Although the most logical extension for an HTML file name is .html or .HTML, and many web authors name their files accordingly, some operating systems insist on three-character extensions, and thus require names of HTML files to end in .htm or .HTM. You need to know the convention in use on your server and follow it. Notice that it is your server, not your own computer, that is relevant here, though as you will want to check your files on your own computer before uploading them to the server you will probably need a naming convention that is compatible with both. Similar considerations apply to JPEG files: you may feel that .jpeg or .JPEG is the logical choice, but in practice three-letter versions (.jpg or .JPG) are almost universal on the web.
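Where the server requires three-character extensions, the conversion can be expressed as a small function. This is only a sketch: the function name and the mapping table are assumptions, covering just the two cases mentioned above, and any links that use the old names still have to be changed separately.

```python
import os

# Three-letter equivalents of the longer extensions mentioned in the text.
SHORT_EXTENSIONS = {".html": ".htm", ".jpeg": ".jpg"}


def with_short_extension(file_name):
    """Return the name with .html or .jpeg shortened to three letters."""
    stem, extension = os.path.splitext(file_name)
    short = SHORT_EXTENSIONS.get(extension.lower())
    if short is None:
        return file_name  # already short, or a type not covered here
    # Keep an all-capitals extension in capitals (.HTML becomes .HTM).
    if extension.isupper():
        short = short.upper()
    return stem + short
```

Names that already have a three-letter extension, such as files.htm or logo.gif, are returned unchanged.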

Respect the capitalization conventions of the operating system. Some operating systems treat capital and lower-case letters as equivalent, some treat them as distinct. Thus one computer would regard Files.HTM and files.htm as different names for the same file, another as the names of two different files. You need to follow the conventions on your host server, but as a general rule it is wise to use a single name consistently for any one file even if the operating system allows variations, and to avoid giving different files names that differ only in their use of capital letters.
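Names that differ only in capitalization are easy to overlook by eye but simple to detect mechanically. The following sketch groups a list of names without regard to case and reports any group with more than one member; the function name is hypothetical.

```python
from collections import defaultdict


def case_collisions(file_names):
    """Return groups of names that differ only in capitalization."""
    groups = defaultdict(list)
    for name in file_names:
        # casefold() gives a case-insensitive key for comparison.
        groups[name.casefold()].append(name)
    return [names for names in groups.values() if len(names) > 1]
```

Given Files.HTM, files.htm and images.htm, the function returns the first two as a single group, which is the pair that one server would treat as one file and another as two.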

Transferring files to the server

As long as your HTML files are stored only on your hard disk you can access them yourself with your browser, but no one else can do so through the web until they are copied to your local server. The actual mechanism for doing this will depend on numerous factors – the operating systems of the server and your own computer, the security policy in force on your local system, and the file transfer software available to you. You have no choice, therefore, but to discuss this aspect with your local administrator. You may be able to obtain a password that will give you access to a particular directory on the server, allowing you to send or modify your files whenever you like, or you may need to work through your web master for every transfer.

Directory structure in the server

If your web master insists, you may have no choice about the directory structure of your server and no choice about where your files are stored. If you do have a choice, and if you have been assigned a directory that you can organize as you like (called mydir/ for the purposes of this illustration), then an arrangement of the following type is useful:

    mydir/                      Main HTML directory
       myfile1.htm              A currently active HTML file
       myfile2.htm              Another currently active HTML file
       images/                  Sub-directory for images
           myimage1.gif         An image called by an HTML file
           myimage2.gif         Another image called by an HTML file
           myimage3.jpg         Another image called by an HTML file
       test/                    Sub-directory for testing new files
           myfile1.htm          A new version of myfile1.htm
           myfile3.htm          A new file
           images/              Images needed by the test files
               myimage1.gif     An image
               myimage3.jpg     Another image
       trash/                   Files to be thrown away
           myfile2.htm          An old version of myfile2.htm
           myfile0.htm          An old file no longer needed

With this arrangement you store all of the HTML files intended for your readers at the top level, and all of the images that they require in a sub-directory images/. In addition, there is a sub-directory test/ with its own sub-directory images/. Use this for new or revised material that you want to check carefully before releasing it to the world: here it contains a new version of the file myfile1.htm as well as a file myfile3.htm that is not yet included at all in the main directory. Outside users will still be able to find these test files if they search deliberately, but they are unlikely to do this if there are no links from your normal pages. If you are anxious to keep prying eyes away from such material you can ask your web master to make this sub-directory invisible to the outside world. The sub-directory trash/ is for files that you intend to discard eventually, but it provides a safeguard against accidentally discarding a file that may prove important later on.
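If you prepare this arrangement on your own computer before transferring it, the directories can be created, and a tested file released, with a short script. The following is a minimal sketch under stated assumptions: the function names make_layout and release are hypothetical, the directory names are those of the illustration, and only the HTML file itself is moved, so any new images in test/images/ have to be copied separately.

```python
from pathlib import Path
import shutil


def make_layout(root):
    """Create the main directory with images/, test/images/ and trash/."""
    root = Path(root)
    for sub in ("images", "test/images", "trash"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


def release(root, file_name):
    """Move a checked file from test/ to the top level.

    An existing top-level file of the same name is moved to trash/ first,
    so that the old version is kept as a safeguard. An older copy of the
    same name already in trash/ is overwritten.
    """
    root = Path(root)
    new_version = root / "test" / file_name
    live_version = root / file_name
    if not new_version.is_file():
        raise FileNotFoundError(new_version)
    if live_version.exists():
        shutil.move(str(live_version), str(root / "trash" / file_name))
    shutil.move(str(new_version), str(live_version))
    return live_version
```

In the illustration above, release("mydir", "myfile1.htm") would move the current myfile1.htm into trash/ and put the new version from test/ in its place.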