I had an opportunity to work on a Magento implementation and checked its database. I had heard about its adaptability to almost any online e-commerce business in almost any country. So, I had a chance to see how it is achieved on the database level.

For example, when we want a system which is adaptable and easy to configure, even a simple table of products becomes a headache because we do not know what will be stored exactly per business and per country, never mind the multilingual requirements.

So, a simple structure will not be sufficient.

CREATE TABLE product
(
    product_id int,
    prd_code varchar(20), 
    prd_name varchar(100),
    prd_desc text,
    prd_price float,
    prd_avail_from date,
    prd_avail_to date,
    category_id int
);

Apart from few very common attributes, we have so many unknowns. One solution to the problem is, like Magento developers did, is to create separate tables to define attributes and to store those attributes for each product.

CREATE TABLE attribute
(
    attribute_id int,
    atr_name varchar(30)
);

CREATE TABLE product_attribute
(
    product_id int,
    attribute_id int,
    atr_value varchar(255)
);

That is the basic idea. They just created more tables for each data type such as int, decimal, datetime, text, etc.

This does not come without extra cost. You cannot use one simple SQL select query to get a list of products any more.

SELECT product_id, attribute_1, attribute_2, attribute_3
FROM product

You cannot even use this type of select query:

SELECT product_id
    (SELECT pa1.atr_value FROM product_attribute pa1
    WHERE pa1.product_id = p.product_id AND pa1.attribute_id = 1) as attribute_1,
    (SELECT pa2.atr_value FROM product_attribute pa2
    WHERE pa2.product_id = p.product_id AND pa2.attribute_id = 2) as attribute_2,
    (SELECT pa3.atr_value FROM product_attribute pa3
    WHERE pa3.product_id = p.product_id AND pa3.attribute_id = 3) as attribute_3,	
FROM product p

Because we do not know the number of attributes and we cannot hard-code attribute IDs e.g. 1, 2 and 3.

In addition to the complexity of recording and managing simple attributes, the other issue is performance: the database is hammered with so many extra queries. That is why the developers create additional background processes to tidy up the database and re-index it regularly. That means the server will need more memory, faster CPUs and faster disks.

You can come with marvelous database designs for the problem you are facing. But the first thing you need to think is:

How often will the system change and require adaptation and configuration?

If your answer is “rarely” or “maybe, once a year” then you need to spend more time on the design, even re-consider the database management system you are using. Nowadays, there are new ones for this purpose such as: CouchDB and MongoDB.