This is one of a series of pages that present the basic principles of using HTML to create a user-friendly web site. This page is concerned with files: how to name them, how to get them onto your server, and how to organize them into directories.
Principles of friendly web pages
The structure of an HTML document
Files
Transferring files to the server
Directory structure in the server
Refinements: using additional features
Further refinements: improving accessibility
Trouble-shooting: why doesn’t it come out as expected?
A web page usually consists of more than one file, the main exceptions being very short pages that contain no images at all. For a typical page the HTML file that is initially requested by the user's browser is just a starting point: each different image included in the page is a separate GIF, PNG or JPEG file, and in addition most pages belong to groups of pages that link to one another, each page being a separate HTML file.
Although it is technically possible to organize a collection of information into a single HTML file with internal links only, no matter how long it is, this is not a good idea, because the longer an HTML file is, the greater the chance that the user will lose interest before it is loaded. Suppose, for the sake of illustration, that the information can logically be organized into ten sections, each requiring about 15 kb of HTML code. Then if it is all contained in a single HTML file this will have a length of around 150 kb. Consider now a user interested in a piece of information contained in the last section, i.e. around 140 kb into the file. Even if the page begins with a contents list providing a link to the required information, it is still necessary to wait for 140 kb of unneeded code to be transferred from the remote server, loaded into the browser and processed by it before the target of the link is reached. This puts an unnecessary demand on the server and on the whole web, and it greatly adds to the time the user must wait. It is far better to break the 150 kb into ten smaller files, not necessarily of exactly 15 kb each, but organized according to the logical coherence of what they contain.
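The arithmetic of this illustration can be written out as a small Python sketch. The section size and count come from the example above; the 2 kb separate contents page is an assumption added here only to make the comparison concrete.

```python
# Sizes in kilobytes, taken from the illustration above.
SECTION_KB = 15
SECTION_COUNT = 10

# Assumption (not in the text): a separate contents page of about 2 kb.
CONTENTS_KB = 2


def single_file_cost() -> int:
    """Kilobytes transferred when all sections live in one HTML file.

    The whole file is fetched whichever section the reader wants.
    """
    return SECTION_KB * SECTION_COUNT


def split_files_cost() -> int:
    """Kilobytes transferred when each section is a separate file.

    The reader fetches the contents page, then only the wanted section.
    """
    return CONTENTS_KB + SECTION_KB


print(single_file_cost())  # 150
print(split_files_cost())  # 17
```

Under these assumptions, a reader who wants only the last section transfers about 17 kb instead of 150 kb.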
There are at least four points to be borne in mind when giving names to HTML and image files:
Choose meaningful names. The HTML file for the page you are reading at this moment is called files.htm, but so far as your browser and the whole system for transferring files across the web are concerned it would fulfil its function just as well if it were called ZRX321.HTM, in the style of the names often used in software libraries on main-frame computers. Nonetheless, files.htm is a better name, because when the time comes to revise it, it will be a great deal easier to identify which file is which if a file that is about files is called files.htm.
Restrict lengths according to the operating system. Regardless of what computer you use to prepare an HTML file in the first instance, it will eventually be stored on a server that will run under an operating system that may or may not be the same as that used on the first computer. Different operating systems have different naming conventions: some restrict names to eight characters before the extension, others do not require an extension and allow names of 31 or 127 characters. You need to know what restrictions apply on your computer before choosing names. As you cannot be certain that your files will always remain on the same server, there is something to be said for choosing the most restrictive convention. Thus eight-character names will be acceptable on any system, and you will not need to change them if you later move them to a different server. (Remember that it is not the renaming of files as such that will create problems; it is tracking down all the links to them in other files and changing them that will require time and effort.)
Choose name extensions according to the operating system. Although the most logical extension for an HTML file name is .html or .HTML, and many web authors name their files accordingly, some operating systems insist on three-character extensions, and thus require names of HTML files to end in .htm or .HTM. You need to know the convention in use on your server and follow it. Notice that it is your server, not your own computer, that is relevant here, though as you will want to check your files on your own computer before uploading them to the server you will probably need a naming convention that is compatible with both. Similar considerations apply to JPEG files: you may feel that .jpeg or .JPEG is the logical choice, but in practice three-letter versions (.jpg or .JPG) are almost universal on the web.
Respect the capitalization conventions of the operating system. Some operating systems treat capital and lower-case letters as equivalent, some treat them as distinct. Thus some computers would regard Files.HTM and files.htm as different names for the same file, others as two different files. You need to follow the conventions on your host server, but as a general rule it is wise to use a single name consistently for any one file even if the operating system allows variations, and to avoid giving different files names that differ only in the use of capital letters.
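The length, extension and capitalization points above can be checked mechanically before files are uploaded. The following Python sketch is only an illustration: the function names are hypothetical, and the permitted character set (letters, digits, underscore and hyphen) is an assumption, since the text specifies only the eight-character name and three-character extension.

```python
import re

# Assumption: the most restrictive convention discussed above, that is,
# at most eight characters before a three-character extension.
# The permitted characters are an assumption, not a rule from the text.
EIGHT_DOT_THREE = re.compile(r"[A-Za-z0-9_-]{1,8}\.[A-Za-z0-9]{3}")


def fits_eight_dot_three(name: str) -> bool:
    """Return True if the name has the form of 1-8 characters, a dot, 3 characters."""
    return EIGHT_DOT_THREE.fullmatch(name) is not None


def case_collisions(names):
    """Return groups of names that differ only in capitalization."""
    groups = {}
    for name in names:
        groups.setdefault(name.lower(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


print(fits_eight_dot_three("files.htm"))    # True
print(fits_eight_dot_three("files.html"))   # False: four-character extension
print(case_collisions(["Files.HTM", "files.htm", "index.htm"]))
# [['Files.HTM', 'files.htm']]
```

Whether a name is meaningful cannot be checked this way; that remains a matter of judgement.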
As long as your HTML files are stored only on your hard disk you can access them yourself with your browser, but no one else can do so through the web until they are copied to your local server. The actual mechanism for doing this will depend on numerous factors: the operating systems of the server and your own computer, the security policy in force on your local system, and the file transfer software available to you. You have no choice, therefore, but to discuss this aspect with your local administrator. You may be able to obtain a password that will give you access to a particular directory on the server, allowing you to send or modify your files whenever you like, or you may need to work through your web master for every transfer.
If your web master insists, you may have no choice about the directory
structure of your server and no choice about where your files are stored.
If you do have a choice, and if you have been assigned a directory that you
can organize as you like (called mydir/ for the purposes of this illustration),
then an arrangement of the following type is useful:
mydir/                    Main HTML directory
    myfile1.htm           A currently active HTML file
    myfile2.htm           Another currently active HTML file
    images/               Sub-directory for images
        myimage1.gif      An image called by an HTML file
        myimage2.gif      Another image called by an HTML file
        myimage3.jpg      Another image called by an HTML file
    test/                 Sub-directory for testing new files
        myfile1.htm       A new version of myfile1.htm
        myfile3.htm       A new file
        images/           Images needed by the test files
            myimage1.gif  An image
            myimage3.jpg  Another image
    trash/                Files to be thrown away
        myfile2.htm       An old version of myfile2.htm
        myfile0.htm       An old file no longer needed
With this arrangement you store all of the HTML files intended for your readers at the top level, and all of the images that they require in a sub-directory images/. In addition, there is a sub-directory test/ with its own sub-directory images/. Use this for new or revised material that you want to check carefully before releasing it to the world: here it contains a new version of the file myfile1.htm as well as a file myfile3.htm that is not yet included at all in the main directory. Outside users will still be able to find this material if they search deliberately, but they are unlikely to do so if there are no links to it from your normal pages. If you are anxious to keep prying eyes away from such material you can ask your web master to make this sub-directory invisible to the outside world. The sub-directory trash/ is for files that you intend to discard eventually, but it provides a safeguard against accidentally discarding a file that may prove important later on.
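One way to work with this arrangement is sketched below in Python. The helper names create_layout and promote are hypothetical, and the release step (moving the old live file into trash/ before the tested version takes its place) is an assumption about how the three directories might be used together, not a procedure prescribed above. It also assumes you have direct file access to the directory.

```python
import shutil
from pathlib import Path


def create_layout(root: Path) -> None:
    """Create the sub-directories of the illustration under root."""
    for sub in ("images", "test/images", "trash"):
        (root / sub).mkdir(parents=True, exist_ok=True)


def promote(root: Path, name: str) -> None:
    """Release test/<name> as the live file, keeping the old version in trash/."""
    live = root / name
    candidate = root / "test" / name
    if not candidate.is_file():
        raise FileNotFoundError(candidate)
    if live.exists():
        # Keep the previous version as a safeguard instead of deleting it.
        # An earlier copy of the same name in trash/ may be overwritten.
        shutil.move(str(live), str(root / "trash" / name))
    shutil.move(str(candidate), str(live))


if __name__ == "__main__":
    root = Path("mydir")
    create_layout(root)
    # After checking test/myfile1.htm in a browser, it could be released with:
    # promote(root, "myfile1.htm")
```

Images needed by a promoted file would have to be moved from test/images/ to images/ in the same way.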