(Paper) MySQL Interview Questions And Answers Set - 9
MySQL Interview Questions And Answers Set - 9
MySQL - Speed of UPDATE Queries
?
Update queries are optimized as a SELECT query with the additional overhead of a
write. The speed of the write is dependent on the size of the data that is being
updated and the number of indexes that are updated. Indexes that are not changed
will not be updated.
Also, another way to get fast updates is to delay updates and then do many
updates in a row later. Doing many updates in a row is much quicker than doing
one at a time if you lock the table.
Note that, with dynamic record format, updating a record to a longer total
length may split the record. So if you do this often, it is very important to
OPTIMIZE TABLE sometimes.
MySQL - Speed of DELETE Queries
?
If you want to delete all rows in the table, you should use TRUNCATE TABLE
table_name.
The time to delete a record is exactly proportional to the number of indexes. To
delete records more quickly, you can increase the size of the index cache.
MySQL - Other
Optimization Tips
Unsorted tips for faster systems:
Use persistent connections to the database to avoid the connection overhead. If
you can't use persistent connections and you are doing a lot of new connections
to the database, you may want to change the value of the thread_cache_size
variable.
Always check that all your queries really use the indexes you have created in
the tables. In MySQL you can do this with the EXPLAIN command.
Try to avoid complex SELECT queries on tables that are updated a lot. This is to
avoid problems with table locking. The new MyISAM tables can insert rows in a
table without deleted rows at the same time another table is reading from it. If
this is important for you, you should consider methods where you don't have to
delete rows or run OPTIMIZE TABLE after you have deleted a lot of rows.
Use ALTER TABLE ... ORDER BY expr1,expr2... if you mostly retrieve rows in
expr1,expr2.. order. By using this option after big changes to the table, you
may be able to get higher performance.
In some cases it may make sense to introduce a column that is 'hashed' based on
information from other columns. If this column is short and reasonably unique it
may be much faster than a big index on many columns. In MySQL it's very easy to
use this extra column: SELECT * FROM table_name WHERE
hash=MD5(concat(col1,col2)) AND col_1='constant' AND col_2='constant' For tables
that change a lot you should try to avoid all VARCHAR or BLOB columns. You will
get dynamic row length as soon as you are using a single VARCHAR or BLOB column.
It's not normally useful to split a table into different tables just because the
rows gets 'big'. To access a row, the biggest performance hit is the disk seek
to find the first byte of the row. After finding the data most new disks can
read the whole row fast enough for most applications. The only cases where it
really matters to split up a table is if it's a dynamic row size table (see
above) that you can change to a fixed row size, or if you very often need to
scan the table and don't need most of the columns.
If you very often need to calculate things based on information from a lot of
rows (like counts of things), it's probably much better to introduce a new table
and update the counter in real time. An update of type UPDATE table set
count=count+1 where index_column=constant is very fast! This is really important
when you use databases like MySQL that only have table locking (multiple readers
/ single writers). This will also give better performance with most databases,
as the row locking manager in this case will have less to do.
If you need to collect statistics from big log tables, use summary tables
instead of scanning the whole table. Maintaining the summaries should be much
faster than trying to do statistics 'live'. It's much faster to regenerate new
summary tables from the logs when things change (depending on business
decisions) than to have to change the running application! If possible, one
should classify reports as 'live' or 'statistical', where data needed for
statistical reports are only generated based on summary tables that are
generated from the actual data.
Take advantage of the fact that columns have default values. Insert values
explicitly only when the value to be inserted differs from the default. This
reduces the parsing that MySQL need to do and improves the insert speed. In some
cases it's convenient to pack and store data into a blob. In this case you have
to add some extra code in your appliction to pack/unpack things in the blob, but
this may save a lot of accesses at some stage. This is practical when you have
data that doesn't conform to a static table structure.
Normally you should try to keep all data non-redundant (what is called 3rd
normal form in database theory), but you should not be afraid of duplicating
things or creating summary tables if you need these to gain more speed.
Stored procedures or UDF (user-defined functions) may be a good way to get more
performance. In this case you should, however, always have a way to do this some
other (slower) way if you use some database that doesn't support this. You can
always gain something by caching queries/answers in your application and trying
to do many inserts/updates at the same time. If your database supports lock
tables (like MySQL and Oracle), this should help to ensure that the index cache
is only flushed once after all updates.
Use INSERT /*! DELAYED */ when you do not need to know when your data is
written. This speeds things up because many records can be written with a single
disk write.
Use INSERT /*! LOW_PRIORITY */ when you want your selects to be more important.
Use SELECT /*! HIGH_PRIORITY */ to get selects that jump the queue. That is, the
select is done even if there is somebody waiting to do a write.
Use the multi-line INSERT statement to store many rows with one SQL command
(many SQL servers supports this).
Use LOAD DATA INFILE to load bigger amounts of data. This is faster than normal
inserts and will be even faster when myisamchk is integrated in mysqld.
Use AUTO_INCREMENT columns to make unique values.
Use OPTIMIZE TABLE once in a while to avoid fragmentation when using dynamic
table format.
Use HEAP tables to get more speed when possible.
When using a normal Web server setup, images should be stored as files. That is,
store only a file reference in the database. The main reason for this is that a
normal Web server is much better at caching files than database contents. So it
it's much easier to get a fast system if you are using files.
Use in memory tables for non-critical data that are accessed often (like
information about the last shown banner for users that don't have cookies).
Columns with identical information in different tables should be declared
identical and have identical names. Before Version 3.23 you got slow joins
otherwise. Try to keep the names simple (use name instead of customer_name in
the customer table). To make your names portable to other SQL servers you should
keep them shorter than 18 characters.
If you need REALLY high speed, you should take a look at the low-level
interfaces for data storage that the different SQL servers support! For example,
by accessing the MySQL MyISAM directly, you could get a speed increase of 2-5
times compared to using the SQL interface. To be able to do this the data must
be on the same server as the application, and usually it should only be accessed
by one process (because external file locking is really slow). One could
eliminate the above problems by introducing low-level MyISAM commands in the
MySQL server (this could be one easy way to get more performance if needed). By
carefully designing the database interface, it should be quite easy to support
this types of optimization. In many cases it's faster to access data from a
database (using a live connection) than accessing a text file, just because the
database is likely to be more compact than the text file (if you are using
numerical data), and this will involve fewer disk accesses. You will also save
code because you don't have to parse your text files to find line and column
boundaries. You can also use replication to speed things up.
Declaring a table with DELAY_KEY_WRITE=1 will make the updating of indexes
faster, as these are not logged to disk until the file is closed. The downside
is that you should run myisamchk on these tables before you start mysqld to
ensure that they are okay if something killed mysqld in the middle. As the key
information can always be generated from the data, you should not lose anything
by using DELAY_KEY_WRITE.

Daily JOBS






Get Your Dream Job! Post Your Resume Today!
Get Your Dream Job! Post Your Resume Today!