Format Alignment: File Groups

Hi everyone,
While creating our database in only one disc(C or D), suppose that we create more than one file group in order to group our data files. However, in this situation; I wonder that whether it brings any benefit or advantage to us.

Also, I wonder that why we always have to put our data file into separate file group if we use separate discs for data files. Is not it allowed to use only one file group even if we use separate dics ?

Would you explain these to me ?

Thanks

If youre on one disk, there's not much difference. If you have 2 disks, putting log on separate disk is the most common and efficient solution in most cases.
For a typical database application, the best configuration of hardware and an average one are not likely to differ dramaticaly in terms of performance. Performance tuning is almost 90% done at SQL and application layers.

But putting log on separate disk always helps ;)

|||

Thanks for your reply but it seems to me a little bit confusing.
Would you mind giving some examples about the efficiency and high level performance of File Groups ?

Thanks

|||

Database transaction typically write some data to the disk, and some data to the log. This means constant log file access for writing. Meanwhile, when there are no failures, this writes can be performed sequentially in large chunks. This dramatically decreases disk access time, because significant part of time that disks need to perform an operation(read or write) consumes "seeking" disk head to the desired position, because this is mechanical operation. Sequential writes avoid seeks.
This is the main thing I could think of given no details. In a rare case if your application is very read-intensive (OLAP), then you might consider partitioning data files among disks to fasten reads.

In a mixed (OLTP) application, consider doing the following, depending on how many disks you have:

A separate disk for all log files, maybe with redundancy(mirroring).
Then, consider separating secondary (non-clustered) indexes from data files.
Then consider separating data files.

|||

Hi,
First of all, thanks for your all replies and thanks for spnding your time for my questions.
I think that I finally kept the idea about the efficiency and high level performance which I want to demonstrate.
For instance, suppose that we recall some records from two different table in one databases and we do this by using inner join. And if we put this two table into different file groups which are located in different discs, then it is really more efficient to call the column of these tables than other situation, is not it ?

Briefly, we do this application in order to share the whole power to the to the discs so that at one time, we can do many things faster than before.

Am I right or wrong ?

Thanks

|||

merit,
Your scheme of splitting data among disks works perfectly with you example provided, that is, when overhelming part of transactions are read-only. Otherwise, if you have read-write (OLTP) application and only 2 disks, just put the data in one disk and log to the other.

|||

Hi,

Again thanks for your reply.

While I was reading a tutorial abou this topic, I heard that The CREATE DATABASE statement is a multithreaded process that creates one process thread for each logical device.

What does this mean ?

Would you please explain it to me ?

|||

Hi,

Another good and significant example about the efficiency of using separate file group in different discs is SELECT INTO statement. Because, while performing this , you both read data from a table and then write this data to another table. Thus, if we try to use different discs for both reading data and writing data, we will have increased the performance and speed of the program.

Menawhile, the another reason for this is that while performing some operations, every disc are used by the system at the same time, is not it ?

Thanks,

Mert

|||

I guess not.

The main point is that, as I mentioned earlier, when you write into some table, you simultaneously write into log too. In this case, it's much better to put data files to one disks and put log to separate disk. Never forget that log is intensively used during write transactions. In your example, you completely do not consider log writes.

|||

Hi,
You suggest me to put log to different discs which I also appreciate since it is obvious that this brings performance benefit to the dics.
But at the following link , it supports what I said about SELECT INTO.
Reads from one discs and write to another table which is located in different file group in different disc, so do you really think that this does not increase the performance ?
Suppose that you performs both operations from the same discs. So while doing this, it is likely that your disc drive spends more power to do it and more time to do it.
So I would like to learn the main reason why you think that only, putting log files into different disc can bring high level performance.

http://www.dell.com/content/topics/global.aspx/power/en/ps2q03se_rosetta?c=us&cs=555&l=en&s=biz

Thanks,
Mert

|||

That's all correct.
In the link you provided those guys featured the following disk configurations:

<Quote>Storage consisted of two EMC^? CLARiiON^? FC4700 storage arrays and two EMC ^? 8530 storage arrays. Each of the EMC CLARiiON FC4700 storage arrays had a 4 GB write-read cache, while each EMC Symmetrix 8530 storage array had a 16 GB write-read cache. Each of the four storage arrays included 96 disks capable of 73 GB and 10,000 rotations per minute (RPM), for a total of 28 TB of raw disk space (12 TB after RAID-1+0 configuration).<Quote>

It is obvious that they had put log into other drive too, see it here:

<Quote>The Rosetta Genomics application database files totaled 10.35 TB for data files and 135 GB for the log file. Administrators defined both a primary, or default, file group and a user-defined file group. The default file group had one data file containing database system objects. Located on the default drive C, the default file group shared drive space with SQL Server binaries and the operating system. The user-defined file group was configured with 40 fixed-size 131 GB data files on CLARiiON arrays and 40 fixed-size 135 GB data files on Symmetrix arrays, for a total of 80 data files?one for each LUN designated for database file storage. Mount points were configured at 132 GB on CLARiiON arrays and 136 GB on Symmetrix arrays.<Quote>

Indeed, if you have SUCH storage capacitites, you can put everything on different drives. In my earler post I wrote that if you have ENOUGH disks you can split data files too. Anyhow, given TWO disks and a WRITE-INTENSIVE application, it is almost always the best practice to separate log from data files.
Once again, if you have such opportunity, indeed, split data files too! I was all about priorities, and the wonderful article you provided says nothing about putting log into one disk and distributing data.
I'd reccomend you the wonderful book of D. Shasha and P. Bonnet, "Database Tuning" for very thorough discussion of the topic.
http://www.amazon.com/gp/product/1558607536/sr=8-1/qid=1151403308/ref=pd_bbs_1/103-2557871-6722269?ie=UTF8

P.S.
Applications with extremely large data storage requirements very often mean that the application is OLAP one, that is, a read-intensive application, when everything I said should be reversed. In those applications, log file is no more a hot spot.

P.P.S. merit, I am just too curious not to ask, is your disk configuraton something (even remotely) close to the configuration in the article?

|||

Thanks for your interest to this thread,

Andranik Khachatryan

For your question, I ask everything since I am also curious about them, not for application on my computers or any problem that I encounter.

Meanwhile, I am studying to the MCAD exam that's why I ask lots of questions recently.

Best wishes,

Mert

|||

Hi,
There something make me still confused about this topic.
For example, when we call some columns from a table with Select query statement, do the SQL server do this by first finding the file group that this table located in and then finding the corresopnding table ?

Thanks

Sunday, February 19, 2012

File Groups

No comments:

Post a Comment

Format Alignment

Blog Archive

About Me