Performance impact of unplanned SnapMirror configurations
VOLUME SNAPMIRROR PERFORMANCE
Volume SnapMirror performance depends mainly on the update
frequency, the network bandwidth, and the storage system utilization. Volume
SnapMirror Async performance is particularly affected by the volume size, the
rate of data change, and, for traditional volumes, the disk geometry.
Disk geometry
For versions of Data ONTAP earlier than 7.0 and traditional
volumes, it is recommended that the source and destination volumes contain
disks of the same size, and be organized in the same RAID group configuration
to gain optimal performance. For flexible volumes, disk geometry matching is no
longer a consideration.
Snapshot copy creation and update frequency
SnapMirror creates a Snapshot copy before every update and
deletes a Snapshot copy at the end. On heavily loaded storage systems, Snapshot
copy creation can take longer, which limits how frequently SnapMirror updates
can run. When schedules stretch out in this way, SnapMirror can end up creating
many Snapshot copies on the source storage system at the same time, which can
impact client access. For this reason, staggered SnapMirror schedules are
recommended to avoid system blockages.
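As a rough illustration of staggering, the sketch below spreads the start minutes of several update schedules evenly across one update interval. The function name and the one-hour interval are assumptions made for this example only; mapping the resulting offsets onto actual SnapMirror schedule entries is left to the administrator.

```python
def staggered_minutes(n_relationships, interval_minutes=60):
    """Return evenly spaced start offsets (in minutes) within one interval.

    Hypothetical helper: it only computes offsets and does not touch
    any SnapMirror configuration.
    """
    if n_relationships <= 0:
        return []
    # Integer arithmetic keeps the offsets whole minutes.
    return [(i * interval_minutes) // n_relationships
            for i in range(n_relationships)]


if __name__ == "__main__":
    # Four relationships updated hourly start at minutes 0, 15, 30 and 45.
    print(staggered_minutes(4))
```

With four hourly relationships, no two updates start in the same minute, so their Snapshot copy creation does not pile up at the top of the hour.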
Volume size and changed blocks
To perform an incremental update, the block map in the new
Snapshot copy is compared to the block map in the baseline Snapshot copy. The
time required to determine the block changes depends on the volume size. With
Data ONTAP 7.0 and later, you can use the snap delta command to determine the
rate of data change between Snapshot copies on a volume.
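Once a rate of change is known, a back-of-the-envelope estimate shows whether an update can finish before the next one is due. The sketch below takes the change rate as a plain number in KB per hour; it does not parse any command output, and the function and parameter names are assumptions for illustration. It ignores the block map comparison time and any metadata overhead, so treat the result as a lower bound.

```python
def update_transfer_seconds(change_rate_kb_per_hour,
                            update_interval_minutes,
                            link_kb_per_sec):
    """Estimate the seconds needed to ship one incremental update.

    Assumes the change rate is constant over the interval and that the
    link throughput (KB/s) is fully available to the transfer.
    """
    if link_kb_per_sec <= 0:
        raise ValueError("link_kb_per_sec must be positive")
    changed_kb = change_rate_kb_per_hour * update_interval_minutes / 60
    return changed_kb / link_kb_per_sec


if __name__ == "__main__":
    # 3,600,000 KB/hour of change, updates every 30 minutes, 10,000 KB/s link.
    seconds = update_transfer_seconds(3_600_000, 30, 10_000)
    fits = seconds < 30 * 60
    print(f"estimated transfer: {seconds:.0f} s, fits in interval: {fits}")
```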
QTREE SNAPMIRROR PERFORMANCE
Qtree SnapMirror performance is affected by deep directory
structures and by large numbers of small files (for example, tens of millions)
being replicated.
Directory structures and large numbers of small files
To determine changed data, qtree SnapMirror looks at the
inode file and determines which inodes are in the qtree of interest and which
inodes have changed. If the inode file is large, but the inodes of interest are
few, qtree SnapMirror spends a lot of time going through the inode file to find
very few changes. Disk I/Os used to access the data become small and
inefficient.
Transfer size
While a qtree SnapMirror update is in progress, the snapmirror
status -l command shows how many kilobytes have been transferred so far; the
value may be greater than the expected delta (the amount of changed data). The
difference is metadata overhead, for example 4-KB headers, file creations,
deletions, ACLs, and so on.
A few more points:
CONCURRENT TRANSFER LIMITATION
A transfer fails when the system has reached the maximum
number of simultaneous replication operations. Each transfer beyond the limit
is retried once per minute.
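The effect of the limit on schedules that all fire together can be seen with a small simulation. This is a simplified model, not the storage system's actual scheduler: it assumes every transfer is requested at minute 0, that each has a known duration in whole minutes, and that waiting transfers are retried in order once per minute. All names are hypothetical.

```python
def simulate_start_minutes(durations, max_concurrent):
    """Return the minute at which each transfer actually starts.

    durations: transfer lengths in minutes, all requested at minute 0.
    max_concurrent: maximum number of simultaneous transfers (>= 1).
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    starts = [None] * len(durations)
    running = []                      # end minutes of active transfers
    pending = list(range(len(durations)))
    minute = 0
    while pending:
        # Drop transfers that have finished by this minute.
        running = [end for end in running if end > minute]
        still_pending = []
        for idx in pending:
            if len(running) < max_concurrent:
                starts[idx] = minute
                running.append(minute + durations[idx])
            else:
                still_pending.append(idx)  # retried next minute
        pending = still_pending
        minute += 1
    return starts


if __name__ == "__main__":
    # Four transfers, limit of two: the last two wait until minute 5.
    print(simulate_start_minutes([5, 5, 3, 2], max_concurrent=2))
```

In this model the third and fourth transfers spend five minutes retrying before they can start, which is the delay that staggering the schedules is meant to avoid.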
To optimize SnapMirror deployment, it is recommended that
the schedules be staggered. For qtree SnapMirror, if there are too many qtrees
per destination volume, the solution is to re-baseline those qtrees to another
volume.
CPU UTILIZATION
SnapMirror may have some impact on CPU utilization, but in the
majority of cases it is not significant.
You can monitor storage system CPU using Operations Manager
Performance Advisor or the Data ONTAP sysstat command.
SYSTEM ACTIVITIES
On heavily loaded systems, SnapMirror competes with other
processes and may impact response times.
To address this problem, you can use FlexShare® software to set
the system priority to High or Very High on storage systems dedicated to
SnapMirror replication.
You can also schedule SnapMirror updates at times when NFS
or CIFS traffic is low, and reduce the frequency of updates.
NETWORK DISTANCE AND BANDWIDTH
When deploying SnapMirror, you have to consider the
round-trip travel time of a packet from the source to the destination storage
system, because network distance causes write latency. The round trip has a
latency of approximately 2 milliseconds if the source and the destination
storage systems are 100 miles apart.
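The 2-millisecond figure can be sanity-checked with a simple propagation estimate. The sketch below assumes signals travel through fiber at roughly two-thirds the speed of light, about 124 miles per millisecond; that constant is an assumption of this example, not a number from the SnapMirror documentation. Real paths are longer than the straight-line distance and add switching and routing delay, so measured round trips are usually higher.

```python
# Assumption: propagation speed in optical fiber is about 2/3 of c,
# i.e. roughly 124,000 miles per second or 124 miles per millisecond.
FIBER_MILES_PER_MS = 124.0


def round_trip_ms(distance_miles, miles_per_ms=FIBER_MILES_PER_MS):
    """Estimate the round-trip propagation delay in milliseconds."""
    if distance_miles < 0:
        raise ValueError("distance_miles must not be negative")
    return 2 * distance_miles / miles_per_ms


if __name__ == "__main__":
    # 100 miles apart: about 1.6 ms of pure propagation delay,
    # in line with the approximate 2 ms quoted above.
    print(f"{round_trip_ms(100):.2f} ms")
```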
Networking issues impacting SnapMirror performance can be
addressed by limiting the bandwidth using the system-wide or per-transfer
network throttle features.
Networking issues can also be addressed by using a dedicated
path for SnapMirror transfers or using multiple paths for load balancing and
failover.
If the network still does not perform up to expectations,
look for typical network problems. For example, duplex mismatches can cause
networks to be very slow.