Recursive Relationships in Databases

On December 5, 2011, in Databases, by murat

One of the tricky issues in entity relationship model is the recursive relationships. Sometimes an entity needs to refer to itself. For example, think of an organisational chart for the roles of employees in a company; each employee is managed by a manager, each manager manages 0 or more employees. The records of all the employees are kept in a table called tblEmployee:

-- T-SQL for MS SQL Server 2005
CREATE TABLE tblEmployee (
  employeeId INT NOT NULL,
  empName VARCHAR(50) NOT NULL,
  empSurname VARCHAR(50) NOT NULL,
  managerId INT NULL,
  PRIMARY KEY (employeeId)
)
GO
 
CREATE INDEX idxManager
ON tblEmployee (managerId)
GO
 
ALTER TABLE tblEmployee
ADD CONSTRAINT fkManager
FOREIGN KEY (managerId)
REFERENCES tblEmployee (employeeId)
GO

Check Alter Table syntax on MSDN.

In MySQL, you can do the same, the table can refer to itself:

http://stackoverflow.com/questions/1969338/mysql-foreign-keys-recursive

..as long as the engine is InnoDB.

Also, triggers can be used to achieve the same result.

Tagged with:
 

Adaptable Database Design

On October 30, 2011, in Databases, by murat

I had an opportunity to work on a Magento implementation and checked its database. I had heard about its adaptability to almost any online e-commerce business in almost any country. So, I had a chance to see how it is achieved on the database level.

For example, when we want a system which is adaptable and easy to configure, even a simple table of products becomes a headache because we do not know what will be stored exactly per business and per country, never mind the multilingual requirements.

So, a simple structure will not be sufficient.

CREATE TABLE product
(
    product_id INT,
    prd_code VARCHAR(20), 
    prd_name VARCHAR(100),
    prd_desc text,
    prd_price FLOAT,
    prd_avail_from DATE,
    prd_avail_to DATE,
    category_id INT
);

Apart from few very common attributes, we have so many unknowns. One solution to the problem is, like Magento developers did, is to create separate tables to define attributes and to store those attributes for each product.

CREATE TABLE attribute
(
    attribute_id INT,
    atr_name VARCHAR(30)
);
 
CREATE TABLE product_attribute
(
    product_id INT,
    attribute_id INT,
    atr_value VARCHAR(255)
);

That is the basic idea. They just created more tables for each data type such as int, decimal, datetime, text, etc.

This does not come without extra cost. You cannot use one simple SQL select query to get a list of products any more.

SELECT product_id, attribute_1, attribute_2, attribute_3
FROM product

You cannot even use this type of select query:

SELECT product_id
    (SELECT pa1.atr_value FROM product_attribute pa1
    WHERE pa1.product_id = p.product_id AND pa1.attribute_id = 1) AS attribute_1,
    (SELECT pa2.atr_value FROM product_attribute pa2
    WHERE pa2.product_id = p.product_id AND pa2.attribute_id = 2) AS attribute_2,
    (SELECT pa3.atr_value FROM product_attribute pa3
    WHERE pa3.product_id = p.product_id AND pa3.attribute_id = 3) AS attribute_3,	
FROM product p

Because we do not know the number of attributes and we cannot hard-code attribute IDs e.g. 1, 2 and 3.

In addition to the complexity of recording and managing simple attributes, the other issue is performance: the database is hammered with so many extra queries. That is why the developers create additional background processes to tidy up the database and re-index it regularly. That means the server will need more memory, faster CPUs and faster disks.

You can come with marvelous database designs for the problem you are facing. But the first thing you need to think is:

How often will the system change and require adaptation and configuration?

If your answer is “rarely” or “maybe, once a year” then you need to spend more time on the design, even re-consider the database management system you are using. Nowadays, there are new ones for this purpose such as: CouchDB and MongoDB.

Tagged with:
 

The aim is to write code efficiently, but without compromising easy debugging, maintenance, and especially adaptability to “possible” changes in future.

When I first started programming with Fortran (77), which had been used intensively for half a century in scientific and engineering applications, I learned that all variables had types: integer, real, double precision, complex, etc. If programming started with numeric computation, it required precision, and to do that we need to reduce unknowns.

If we are allocating a room to store data, we need to know the variations so that we can allocate room with proper size. We can not complain to the hotel after booking a room for 2 and arriving with 3 more friends!

I believe it (the chaos) started with variant types and/or pointers! Let me explain..

When I started learning Visual Basic (3.0) and that you do not have to define your variables and data types in certain modes, I was shocked! “Basically” variables of “variant” type were created/destroyed automatically on-the-fly, which may sound allowing so-called programmers to write quick and dirty pieces of code.

One of the best features of Visual Basic IDE was the auto-complete feature where it understood your variables as you write code and depending on the type/class it used to list the properties and functions after hitting the magic character “.” (the dot). The second feature was debugging: I could “step into” code line by line and watch variables.

I think majority of the bugs can be eliminated while you are writing the code. Dynamic languages come with advantages and disadvantages. It is good to have better adaptability but you need to reduce the unexpected run-time errors (exceptions). A programmer needs to expect the unexpected. The more exceptions you handle in your code the better it will run.

For example, PHP magic methods: __get(), __set() and __call()

If we do not explicitly define the properties of our class and simply use a hidden array to manage data dynamically; if we do not explicitly specify the functions (methods) of our class, then we can not expect much help from our favourite IDE (unless you hate IDEs and use notepad or vi, even worse).

When I found Eclipse PDT, I have gone back in time to 90s when I used Visual Basic! It can understand your data structures as you write code. An important note is that you need to help yourself and your team by writing PHPDoc comments for your variables, class variables, functions so that Eclipse can help you in return. For example:

class pc {
 
    /**
     * Serial number of a PC
     * @var string
     */
    public $serial_number;
 
    /**
     * Clock speed as MHz
     * @var int
     */
    public $speed;
 
}//end class
 
class programmer {
 
    /**
     * Name of a programmer such as 'Murat'
     * @var string
     */
    public $name;
 
    /**
     * PC of a programmer
     * @var pc
     */
    protected $pc;
 
    /**
     * Set PC of a programmer
     * @param pc $pc Note that PHP enforces the type of the input at run-time
     * @return void
     */
    public function setPC (pc $pc) {
        //todo
    }
 
    /**
     * Get PC of a programmer
     * @return pc
     */
    public function getPC () {
        //todo
    }
 
}//end class
 
//sample usage
$a_pc = new pc();
$a_pc->serial_number = '1976';
$a_pc->speed = '100';
 
$bill_gates = new programmer();//he was!
$bill_gates->name = 'Bill';
$bill_gates->setPC($a_pc);//on fire

Considering hundreds of classes in PHP frameworks and libraries these days, “->” (arrow) is a life saver in Eclipse.

Eclipse is one of the best donations of IBM to the IT community!

There are even more dangerous features in PHP, like Microsoft made millions of enthusiasts “programmers” BASICally, and presented office macros to hackers in silver plate, Zend created millions of web programmers and gave them eval and call_user_func_array together with SQL injections.

For example:

 
class my_app {
 
    static function main (){
        //read some settings from a file; a XML or even a YAML file these days
        $butler = find_my_butler();
        $task = read_your_task();
        $any_arguments = array();//no arguments please
        //just do something
        call_user_func_array ( array ($butler, $task), $any_arguments);
    }
 
}//end class

That is what I call: programming in the dark!

Simple solution is to define interfaces. Interfaces set the rules of communication between classes and systems. So, if we receive an input object which implements the methods in our interface, we will happily call the methods and know what it will likely do.

 
interface butler {
    public function call_my_driver ();
}
 
class chef implements butler {
    //maybe
 
    public function call_my_driver (){
        //do it
    }
}
 
class my_app {
 
    static function main (butler $a_butler){
        $a_butler->call_my_driver();
    }
 
}//end class
 
$a_chef = new chef();
my_app::main ($a_chef);

This is perfectly acceptable, and we are aware of what’s going on during design-time and run-time.

There are millions of PHP applications out there waiting to be fixed; and many “poor” company directors don’t even know what they are paying for when they are hiring/employing an “expert programmer”!

Happy coding!

Tagged with:
 

XAMPP is just another package of:

Apache HTTP Server,
MySQL,
PHP + PEAR + Zend Framework, phpMyAdmin,
Perl,
FileZilla FTP Server,
Mercury Mail Server,
OpenSSL,
Webalizer,
etc.

like WAMP, EasyPHP (the first of its kind I’ve known), and others:

http://en.wikipedia.org/wiki/Comparison_of_WAMPs

XAMPP 1.7.3 includes:
— Apache 2.2.14 (released in October 2009) and PHP 5.3.1 (released in November 2009)

XAMPP 1.7.4 includes:
— Apache 2.2.17 (released in October 2010) and PHP 5.3.5 (released in January 2011)

(…because my main concern is Apache and PHP)

Naturally, from time to time, we have to plan upgrading the web application environment. It is imperative to upgrade almost every year, if not every 6 months.

Our application is connected to Microsoft SQL Server 2005; so, we’ve been using php_mssql.dll and php_pdo_mssql.dll to enable us to use PDO.

Guess what? XAMPP 1.7.4 did not come with PHP libraries for SQL Server!

Immediate reactions:

+ reverting back to XAMPP 1.7.3
+ or manual installation of the latest Apache and PHP (PHP 5.3.6 is already released)

PHP 5.3.6 does not come with libraries for SQL Server either! I found out that mighty Microsoft decided to give better support for PHP by providing better libraries for “the” SQL Server:

http://msdn.microsoft.com/en-us/sqlserver/ff657782.aspx

You have to install native client libraries and download DLLs for PHP, such as thread-safe and compiled using VC9:

php_sqlsrv_53_ts_vc9.dll
php_pdo_sqlsrv_53_ts_vc9.dll

This lead to another critical issue: although we used PDO for SQL Server, our code got broken when we tried to use these new libraries. We understood that Microsoft finished implementing their PDO library without backward compatibility!

Here are some examples of features/functions that caused us headache:

PDO::beginTransaction
PDO::commit
PDO::rollBack

Since these were not available in the previous version of pdo_mssql, we had to use SQL commands:

BEGIN TRANSACTION;
COMMIT;
ROLLBACK;

Also, we had issues with PDO::query which included SQL commands that executed stored procedures in the database.

In short, we could not take the risk of discovering more broken pieces of code!

We have swiftly re-installed XAMPP 1.7.3!

What’s going to happen next?!

Tagged with:
 

PHP FastCGI on LightTPD

On March 4, 2011, in Internet, Programming, Web, Windows, by murat

Recently, I’ve been having issues with php-cgi.exe crashing on Apache 2.2 and IIS7 on Windows 2008. So, I’ve been searching for alternative ways of running my PHP application.

I came across nginx and lighttpd.

Let’s try LightTPD 1.4.28 and PHP 5.3.5.

Get the Windows binaries for LightTPD from WLMP project: http://en.wlmp-project.net/downloads.php. I chose the ZIP package with SSL support.

Get the Windows binaries for PHP from: http://windows.php.net/download/. I chose the ZIP package VC9 x86 Non-Thread-Safe.

Uncompress them into drive C: so that we can have C:\LightTPD and C:\PHP .

Go to C:\LightTPD\conf\
Edit lighttpd-srv.conf. Change file relative paths to absolute paths to read:

server.errorlog      = "c:/lighttpd/logs/lighttpd-srv.error.log"
accesslog.filename   = "c:/lighttpd/logs/lighttpd-srv.access.log"
server.pid-file       = "c:/lighttpd/logs/lighttpd-srv.pid"
include "c:/lighttpd/conf/lighttpd-tag.conf"
include "c:/lighttpd/conf/lighttpd-inc.conf"

Edit lighttpd-inc.conf.
Enable mod_fastcgi among server.modules and check following settings:

server.document-root = "C:/LightTPD/htdocs/"
server.upload-dirs = "C:/LightTPD/tmp/"
fastcgi.server = ( ".php" =>
    ("localhost" => ("host" => "127.0.0.1", "port" => 521 ) )
)

Go to C:\LightTPD\htdocs
Create phpinfo.php with content:

<?php phpinfo(); ?>

Go to C:\PHP.
Have a copy of php.ini-production as php.ini.

Run PHP CGI:

C:\PHP>php-cgi.exe -b 127.0.0.1:521

Run LightTPD:

C:\LightTPD>LightTPD.exe -f conf\lighttpd-srv.conf -m lib

Go to:
http://localhost/phpinfo.php

Voila!

It is definitely worth trying. Consider modifying the config files so that LightTPD can quickly serve static files such as HTML, CSS, JavaScript, JPEG, PNG files; and PHP files are passed to the CGI application or service for further processing. Also consider the process recycling settings, which is mainly affected by setting the number of requests served before the process is killed.

References:
http://php.net/
http://www.lighttpd.net/
http://redmine.lighttpd.net/wiki/1/TutorialLighttpdAndPHP

Tagged with: