by ivan at 2013-08-10
The .htaccess file is a very important file. It allows us to do things before a request starts being processed. The content of this file can be included as part of the virtualhost item in your Apache configuration, but sometimes we don't have access to those files, and .htaccess offers an alternative for doing these things.
But... what are the things that we are talking about? OK, yes, it's true, I offered you a meeting with a guy but you only know his name. But this has a solution. We are talking about:
.htaccess file or to a resource that has more depth in the file system.
mod_rewrite.c module.
This is possibly the most used functionality of the .htaccess file. It simply redirects the traffic of one URL to another, but we should also take into account that this redirection isn't the only kind that we can manage via .htaccess. What?
There are two types of redirections: external and internal. The first creates a new request; the second maps the traffic from one place to another without the need for a new request. Confusing? Below I explain it better.
As we said before, an external redirection tells the browser that the current request is trying to reach a place that is no longer correct, so we need to know the new place that holds the requested content. In effect, we say to the browser "go here and your user will find the content they want". This is a very simple concept to understand, I think, but why would a user request a URL that doesn't exist? There are several reasons, but if we talk about percentages, two stand out:
OK, now we know two of the many possible reasons for a request to a URL that no longer exists. So, what do we need to know to redirect this traffic to the new URLs?
We need to know about status codes. When we request something over HTTP, the server returns a status code that tells the browser the status of the request. By default, the status code is 200, which is the OK code. There are lots of different codes (you can view them all here), but in this article we focus our efforts on only one: 301.
301 is the external redirect par excellence. It's the most used and the most useful. Why? Because with the 301 code (Moved Permanently) we tell the browser that this content now lives in another place and, with that, we also say that the SEO attributes of the old page need to be passed to the new one.
If we have an old domain and need to send all its traffic to a new one, or if we removed a section of our website that had a good position in search engines... 301 is our solution. We can tell the browser and the robots that this whole set of pages is now one page (for example the home page) or several (mapping one to one or many to one), and ensure that the quality of our page isn't lost.
Internal redirections are a little bit more "complex". Not too much, but we need to understand what the default behaviour of a website is.
By default, a website has a folder architecture similar to the directory system of any operating system. This means that if we request this page:
http://www.mydomain.com/article/posts/example.html
we are trying to access the example.html file that is in the posts folder, inside the root folder article. OK, this is perfect, but... what happens when our website isn't composed only of static files? Or what happens when we want to use URLs without extensions? We have an issue.
To solve this situation, we can use internal redirections. This technique allows us to map any URL to a file, saying that this file can resolve this URL or set of URLs. And when we do this, we serve the result without the need to create a new request. Why? Because if we created a new request, we would need to expose a file that can handle it, and since we said that we don't want extensions in our URLs... either we can't do it, or we start a very funny infinite loop of redirections. Nobody wants that for their website, right?
Well, now we know the theory, but we need to know how to implement our redirection solution. As with the redirections themselves, we have two solutions: one to one and many to one.
Sometimes, when we change our website's URLs, the mapping between the old URLs and the new ones is one to one: for every old URL we have a new one. To tell the browser that we want to answer the old URL with the new one, we need to write the next statement.
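A minimal sketch of a one to one redirection, assuming the Redirect directive from mod_alias is available on the server. The paths and the target domain are hypothetical and only serve as illustration.

```apache
# Hypothetical paths and domain, used only for illustration.
# Syntax: Redirect <status code> <old URL path> <new URL>
Redirect 301 /old-page.html http://www.mydomain.com/new-page.html
Redirect 301 /old-section http://www.mydomain.com/new-section
```

Each line handles one old URL, so a site with ten changed URLs would need ten lines like these.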
As we can see, we can use this type of redirection for all types of URLs (with extension, without extension...), and if we need to send another response code, we only need to change 301 for the status code that we want to send. It's no more difficult than adding one line for every URL that we want to redirect.
The one to one redirection is a good solution, but sometimes we have lots of URLs and adding one line for every URL can be very hard. For this, there are many to one redirections. We call them many to one, but the real meaning is that we set a rule and all the URLs that match this rule go to one page, which can also take params.
That said, it's possible that every URL goes to a different URL, or that several URLs match the same rule and, if the target URL doesn't have params, all of them go to the same URL.
This type of redirection can be written with different syntaxes, but in this article we will talk about one that we think can cover all your needs: the combination of the RewriteCond and RewriteRule functionalities.
RewriteCond allows us to set conditions that affect our redirection rules. We can use a set of conditions for the same rule, creating a complex rule. The syntax of this type of statement is very simple, but it's a little bit tricky.
To explain it more easily, we start with an example.
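The original example is not reproduced here, so the following is only a sketch with three conditions, assuming mod_rewrite is enabled. The section names /old-section/, /legacy/ and /new-section/ are hypothetical.

```apache
RewriteEngine On

# Condition 1: only GET requests are considered.
RewriteCond %{REQUEST_METHOD} ^GET$
# Condition 2: the URL starts with /old-section/ (case-insensitive),
# OR...
RewriteCond %{REQUEST_URI} ^/old-section/ [NC,OR]
# Condition 3: ...the URL starts with /legacy/ (case-insensitive).
RewriteCond %{REQUEST_URI} ^/legacy/ [NC]

# Applied only when condition 1 AND (condition 2 OR condition 3) hold.
RewriteRule ^ /new-section/ [R=301,L]
```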
In this example we can see that we have 3 conditions. All of them have the same syntax, which we can define as:
RewriteCond Variable Value [Flags]
OK, every statement starts with RewriteCond, followed by three custom fields.
Some variables have their own literals, like %{REQUEST_FILENAME}, whose value can be tested against the name of the file that we are requesting or against the type of this file, which can be defined with:
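As a hedged illustration, these are some of the file-type tests that mod_rewrite accepts in the value position. Each condition below is an independent sample; in a real configuration a RewriteRule would follow the conditions it depends on.

```apache
# -f: the path is an existing regular file
RewriteCond %{REQUEST_FILENAME} -f
# -d: the path is an existing directory
RewriteCond %{REQUEST_FILENAME} -d
# -l: the path is a symbolic link
RewriteCond %{REQUEST_FILENAME} -l
# -s: the path is a regular file with a size greater than zero
RewriteCond %{REQUEST_FILENAME} -s
# A leading ! negates any of these tests (here: not an existing file)
RewriteCond %{REQUEST_FILENAME} !-f
```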
As you can see, there are lots of possible values and every variable has its own. This article doesn't try to describe them all, only to talk about the .htaccess file and its basic functionality and configuration.
-strmatch, which needs to be followed by an expression before the flags.
Flags are optional. By default, consecutive conditions are combined with an implicit AND, which stays in effect unless we set the [OR] flag. We can set more than one flag by separating them with commas. The flags that we can define are:
With this, we are ready to define the RewriteRule functionality. After defining some RewriteCond statements (if we like, we can define no conditions at all and work directly with the RewriteRule), we add a RewriteRule statement, which has the next syntax.
RewriteRule Pattern Result [Flags]
As we can see, it is very similar to the RewriteCond syntax, but has some differences. In this case, the definitions of the different parts are:
The pattern section is a regular expression that defines which of the requested URLs are processed by this rule.
The result of the rewrite, written as a literal that can include back-references such as $1 to the groups captured by the pattern. If the result is -, it means do nothing.
As in the RewriteCond statement, flags are optional and we can chain as many as we like, separating them with commas. The most used are:
[OR] statement in RewriteCond. If this rule doesn't match, processing goes to the next one.
R=301, the new request has the status code that we put after the equals sign.
RewriteCond statements that we defined before.
OK, with this we have all the information that we think is necessary. Now, some explained examples to complement this information.
Redirect all the traffic of one domain to another domain
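One possible way to write this, assuming mod_rewrite is enabled. The domains olddomain.com and newdomain.com are hypothetical placeholders.

```apache
RewriteEngine On

# Match the old domain, with or without www, ignoring case.
RewriteCond %{HTTP_HOST} ^(www\.)?olddomain\.com$ [NC]

# Send every path to the same path on the new domain with a 301.
# $1 is the path captured by the pattern; L stops further rule processing.
RewriteRule ^(.*)$ http://www.newdomain.com/$1 [R=301,L]
```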
Redirect all the traffic of one domain to one file
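A sketch under the same assumptions (mod_rewrite enabled, hypothetical domains), this time sending everything to a single hypothetical page called moved.html.

```apache
RewriteEngine On

# Match the old domain, with or without www, ignoring case.
RewriteCond %{HTTP_HOST} ^(www\.)?olddomain\.com$ [NC]

# Whatever the requested path is, answer with a 301 to one single file.
RewriteRule ^ http://www.newdomain.com/moved.html [R=301,L]
```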
Redirect all the traffic that doesn't exist as a file or directory in the file system
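A common form of this internal redirection, sketched here with a hypothetical index.php that receives the requested path in a hypothetical path parameter.

```apache
RewriteEngine On

# Skip requests that point to a real file...
RewriteCond %{REQUEST_FILENAME} !-f
# ...or to a real directory.
RewriteCond %{REQUEST_FILENAME} !-d

# Everything else is handled internally by index.php, without a new request.
# QSA keeps the original query string; L stops further rule processing.
RewriteRule ^(.*)$ index.php?path=$1 [QSA,L]
```

Because index.php exists as a real file, the rewritten request fails the first condition and the rule is not applied again, which avoids the infinite loop mentioned earlier.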
If you have more questions about these types of redirections, or you would like to know more about all the possibilities that this Apache module offers, here you have the official documentation.
Another common use of this file is to define access control via user and password for some parts of our website. To do it, we need to add these lines to our .htaccess file.
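A minimal sketch of such a block, assuming HTTP Basic authentication and a password file stored at the placeholder path /path/to/.htpasswd. The realm name is hypothetical.

```apache
# Use HTTP Basic authentication.
AuthType Basic
# Text shown by the browser in the login prompt (hypothetical name).
AuthName "Restricted area"
# File that holds the users and their hashed passwords.
AuthUserFile /path/to/.htpasswd
# Any user listed in that file may enter.
Require valid-user
```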
As we can see, we have an AuthUserFile. What type of file is this? The .htpasswd file is a file that contains users and passwords. To generate it, we need to execute the next line:
htpasswd -c /path/to/.htpasswd username
After this, the console asks us for the password for the user username, and when we type it, the .htpasswd file is generated with this user. If we need to add more users, we need to type the next line for each one, and type the password when the prompt appears.
htpasswd /path/to/.htpasswd username
If we type a username that already exists, we change its password.
In the configuration lines that we added before, we can see a line that says Require valid-user. This means that any user in the file is allowed to enter this section. If we want to allow only some of the users that are in the file, we can use Require user followed by their usernames separated by spaces.
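For instance, a sketch that allows only two users, assuming that the hypothetical usernames alice and bob exist in the password file:

```apache
AuthType Basic
AuthName "Restricted area"
AuthUserFile /path/to/.htpasswd
# Only these two users may enter; other users in the file are rejected.
Require user alice bob
```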
mod_headers.c module.
The .htaccess file can also be used to change some headers of our responses. To do it... we need to add code like this:
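The original snippet is not reproduced here; the following is a sketch that assumes mod_headers is enabled. The extensions and the lifetimes are illustrative choices, not recommendations.

```apache
<IfModule mod_headers.c>
    # Images and icons: cache for 30 days (2592000 seconds).
    <FilesMatch "\.(ico|jpe?g|png|gif)$">
        Header set Cache-Control "max-age=2592000, public"
    </FilesMatch>

    # Stylesheets and scripts: cache for 7 days (604800 seconds).
    <FilesMatch "\.(css|js)$">
        Header set Cache-Control "max-age=604800, public"
    </FilesMatch>
</IfModule>
```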
This example sets a max-age value depending on the extension of the file being requested. If we need to set another header, we only choose which one we want and what its value is. Here you have the entire list of the possible headers that the HTTP protocol has.
mod_expires.c module.
When we work with Google Page Speed, this is one of the typical things it says we haven't implemented yet. Browsers need to know how long we want them to keep our static resources without requesting them again. Below, an example of setting this information in the .htaccess file.
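A minimal sketch, assuming mod_expires is enabled. The MIME types and the durations are only examples.

```apache
<IfModule mod_expires.c>
    # Turn on the generation of Expires and Cache-Control headers.
    ExpiresActive On

    # Fallback for any type not listed below.
    ExpiresDefault "access plus 1 month"

    # Per-type lifetimes, counted from the moment the browser requests the file.
    ExpiresByType text/css "access plus 1 week"
    ExpiresByType application/javascript "access plus 1 week"
    ExpiresByType image/png "access plus 1 year"
    ExpiresByType image/jpeg "access plus 1 year"
</IfModule>
```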
In this section, two easy and useful functionalities that the .htaccess file offers and that we think we should all have in our .htaccess files.
mod_rewrite.c module.
If we write the next lines at the beginning of our .htaccess file, we compress all our responses with the gzip algorithm, reducing their weight.
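The original lines are not reproduced here. One common way to get gzip compression, shown as a sketch, relies on mod_deflate, which is an assumption and may differ from the approach the article had in mind. On Apache 2.4 the AddOutputFilterByType directive also needs mod_filter to be loaded.

```apache
<IfModule mod_deflate.c>
    # Compress text-based responses before sending them to the browser.
    AddOutputFilterByType DEFLATE text/html text/plain text/css
    AddOutputFilterByType DEFLATE application/javascript application/json
</IfModule>
```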
mod_deflate.c module.
Another useful functionality. If we put the next lines at the beginning of our .htaccess file, we remove the unnecessary white spaces and line breaks from our HTML, CSS and JavaScript files, reducing the weight of our responses.
That's all folks! With this we end this entry. If you have more questions, or if you'd like to suggest a subject that you think would be interesting for us to talk about, please fill in our contact form and we will try to reply as soon as possible.
Thanks for your attention and we hope to see you again in the next article!
Apache, we need to have console access to the server. If you don't have this type of access, you need to talk with your system administrators. However, if you have access, it's very simple to do it on your own. You only need to execute this line for every module that you need to install and enable:
a2enmod name_of_the_module
mod_ and without the extension .c. Then, if you need to enable mod_rewrite.c, you need to write:
a2enmod rewrite