When it comes to map production, one of the most common challenges is managing all of your organization’s mapping standards. Esri Production Mapping’s views help address this challenge. With views, you can save your data frame and layer properties to the geodatabase and apply them at any time in ArcMap. This ensures production staff are using the latest map settings defined by your organization, and promotes standardization and consistency across your map products.
One of the most common challenges when working with vector data is maintaining coincidence.
Maintaining coincidence between adjacent polygon features is important when modeling real-world information in a GIS. There are a number of tools in ArcGIS 10 for Desktop that allow users to edit and create features that share boundaries, helping to eliminate gaps, slivers, and overlaps.
Previous blog posts have introduced you to data driven pages and product library but referred to them as separate, standalone tools. In this blog I’d like to show you how to use product library to manage your data driven pages. Now you might be asking why you would want to store your data driven pages in product library. Remember, product library helps you enforce and standardize your map production processes by centrally managing all of your production-related information such as business rules, documents, workflows, and spatial information. By storing your data driven pages in product library you can take advantage of capabilities such as search, history tracking, check-in and check-out, permissions, and so forth.
In the following steps I’ll walk you through a basic workflow for importing and managing a US State map book in product library.
1. In ArcCatalog, create a new file geodatabase.
2. In ArcMap, add the Production Cartography toolbar and click the Product Library window button.
This post was written by geodatabase Product Engineer James MacKay. James works on the geodatabase development team and is responsible for much of the geodatabase SDK that is generated.
There are several methods in the Geodatabase API that use conformant arrays: C-style arrays that can be dimensioned at runtime. In most cases, if you see a method with a pair of related parameters – an integer indicating capacity (or something similar) and a pointer to an array of values – it’s a safe bet that the method is looking for a conformant array.
A method with a conformant array parameter: ISelectionSet.AddList
Unfortunately, conformant arrays and Interop don’t mix well; although they may work occasionally, they will eventually cause problems. There are two workarounds: the GeoDatabaseHelper class and the GEN interfaces.
The GeoDatabaseHelper class implements two interfaces, IGeoDatabaseBridge and IGeoDatabaseBridge2, which implement methods that can’t be used in a straightforward way through Interop. Three of the more common methods are IFeatureClass.GetFeatures, ISelectionSet.AddList, and ISelectionSet.RemoveList. The workarounds are IGeoDatabaseBridge.GetFeatures, IGeoDatabaseBridge2.AddList, and IGeoDatabaseBridge2.RemoveList, respectively.
GEN interfaces are identical to the interfaces whose methods use conformant arrays, but with the array type defined as SAFEARRAY instead of a conformant array; they are implemented by the same classes. There are four GEN interfaces defined in the esriGeodatabase library: INetTopologyEditGEN, IForwardStarGEN, IUtilityNetworkGEN, and IEnumNetEIDBuilderGEN.
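To make the difference concrete, here’s a minimal pure-Python sketch of the two calling conventions. The function names are hypothetical and only mirror the shape of a method like ISelectionSet.AddList; this is not ArcObjects code.

```python
def add_list_conformant(selection, count, oid_buffer):
    """Conformant-array style: the caller passes a separate count plus a
    raw buffer. Because the count and the buffer travel independently,
    this is the shape that COM Interop marshalling struggles with."""
    for i in range(count):
        selection.add(oid_buffer[i])

def add_list_gen(selection, oids):
    """SAFEARRAY/GEN style: the array is self-describing, so its length
    travels with it and the marshaller can copy it reliably."""
    for oid in oids:
        selection.add(oid)

selection = set()
add_list_conformant(selection, 3, [101, 102, 103])
add_list_gen(selection, [104, 105])
print(sorted(selection))  # -> [101, 102, 103, 104, 105]
```

From .NET or Java, the IGeoDatabaseBridge2.AddList workaround plays the role of add_list_gen here: it accepts a managed array and performs the conformant-array call for you.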
Just a heads up to tell you that we’ve posted a georeferencing video in the raster section of the help. We’ve been receiving a lot of support calls about georeferencing so hopefully this video will help people understand and implement it a little better.
If you’re having trouble georeferencing raster data, or are simply interested, you should check out the video.
This blog entry has been taken from a podcast previously recorded by Derek Law from the geodatabase team. We really liked the podcast and thought there was some valuable info to be shared. To hear the podcast and check out other useful podcast topics visit: http://www.esri.com/news/podcasts/instructional_series.html
The performance of an enterprise level ArcSDE geodatabase is influenced by many factors, such as hardware configuration, network configuration, network traffic, and the number of concurrent users.
The tips in this blog entry are not database platform-specific; they are general recommendations that should help you improve the performance of your enterprise geodatabase.
So, five best practices for maintaining an ArcSDE geodatabase are:
- Increase the frequency of updating statistics on tables
- Rebuild indexes on tables
- Plan parent-child version relationships carefully
- Compress the geodatabase often
- Monitor system resources
1. Increase the frequency of updating statistics on tables
Statistics in the database describe the column data stored in tables. They help the database Query Optimizer to estimate the selectivity of SQL expressions, and enable it to accurately assess the cost of different query plans. The optimizer then chooses the most efficient execution plan for retrieving and/or updating data in the database. Having poor statistics is a frequent cause of poor performance. Keeping accurate up-to-date statistics will help improve database performance, because this will enable the Query Optimizer to make more accurate assessments of query execution plans.
The frequency of updating statistics will depend on the editing activity in the geodatabase. Typically, more editing activity means you should update statistics more frequently. This is the responsibility of the database administrator, not the ArcSDE software, which does not maintain statistics. You can update statistics for a table or feature class in ArcCatalog with the Analyze Components dialog box, which updates the statistics for the supporting tables associated with the selected object.
You should also update statistics on the SDE repository tables; this can be done with database management software. As a general rule of thumb, we suggest you update statistics weekly or monthly, and typically before and after a compress. This can be automated at the database level.
There is one exception: in situations where all users are editing just the SDE.DEFAULT version, you should keep the statistics you collected before the compress. This ensures the query optimizer still treats the delta tables as active.
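To see why stale statistics hurt, consider a toy cost model. This is purely illustrative; real optimizers use far richer statistics than a single row count, and the 10% threshold here is made up.

```python
def choose_plan(estimated_matching_rows, total_rows_in_stats):
    """Pick an access path from a selectivity estimate, the way a query
    optimizer uses table statistics. The threshold is invented for
    illustration only."""
    selectivity = estimated_matching_rows / total_rows_in_stats
    return "index scan" if selectivity < 0.1 else "full table scan"

# Fresh statistics: the optimizer knows ~500 of 1,000,000 rows match,
# so it correctly picks the selective index.
print(choose_plan(500, 1_000_000))  # -> index scan

# Stale statistics: the table has since grown, but the statistics still
# describe a 2,000-row table, so the same query is costed as
# unselective and gets the wrong plan.
print(choose_plan(500, 2_000))  # -> full table scan
```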
2. Rebuild indexes on tables
Indexes are used in a database to speed up the retrieval of rows from a table, and they are also used by the database Query Optimizer when assessing query plans. As tables are modified by updates, inserts, and deletes of records, the corresponding indexes can become fragmented and unbalanced. This leads to increased I/O processing, which affects performance. This tip works in conjunction with the previous one: if you update statistics frequently, you should also consider rebuilding indexes when they become fragmented. Both actions will help improve performance.
In general, accurate statistics help to define a good index. You can assess the usefulness of an index with database management tools by monitoring its usage. Another benefit of rebuilding indexes is that you may reclaim disk space lost to fragmentation. In versioned editing environments (where edits are performed daily), you may want to consider rebuilding indexes at regular intervals (for example, weekly or monthly) to keep performance degradation under control. We recommend you rebuild indexes after a compress. You can rebuild indexes within a database management program or with ArcSDE commands.
For more information, see Knowledge Base (KB) Article #24518, “FAQ: How can ArcSDE performance be improved?”
3. Plan parent-child version relationships carefully
The versioning environment within an ArcSDE geodatabase enables users to implement and sustain complex business workflows. Typically, the number of versions and how they are interrelated will depend on your business workflow. It is important to properly manage versions in the geodatabase, because poor version management will impact performance. Keep the following in mind: every edit in the geodatabase adds a state to the state tree. A state tree represents the total number of edit states stored in a geodatabase. Think of it conceptually like a flow chart diagram of circles and lines that flows from top to bottom. Each circle represents an edit state, and the states are linked by lines showing the edit history in the geodatabase.
A state tree typically has a structure similar to an upside-down tree, starting with one circle at the top (let’s say it’s state zero) and flowing down in many branches. For example, a busy ArcSDE geodatabase may accumulate approximately one million edits per day, resulting in hundreds of thousands of edit states in a state tree.
Ideally, you want to keep the state tree as simple and as small as possible. Versions are pointers to edit states, and they “pin” the state tree; in other words, they prevent it from being trimmed, keeping its structure complex. This can affect performance, because queries may take longer to execute. The more complex the versioning model (in other words, the more versions you have), the more potential records in the delta tables, and the slower performance can become.
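A tiny model can make the “pinning” idea concrete. In this sketch (illustrative only, not how ArcSDE actually stores states), each state records its parent, versions pin states, and a compress can trim only the states that lie on no pinned version’s lineage:

```python
def trimmable(states, pinned):
    """Return the states a compress could remove: those not on the path
    from any pinned state back to state 0 (the root)."""
    keep = set()
    for state in pinned:
        while state is not None:
            keep.add(state)
            state = states[state]  # walk up to the parent state
    return set(states) - keep

# child -> parent; state 0 is the root of the state tree
states = {0: None, 1: 0, 2: 1, 3: 2, 4: 1, 5: 4}

# One version pinning state 3: the side branch 4-5 can be trimmed.
print(sorted(trimmable(states, pinned={3})))     # -> [4, 5]

# A second version pinning state 5: now nothing can be trimmed.
print(sorted(trimmable(states, pinned={3, 5})))  # -> []
```

Reconciling and deleting versions removes pins, which is exactly why the recommendations in this section shrink the state tree.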
In general, you should try to do the following:
- Reconcile versions to the SDE.DEFAULT version as soon as you can.
- Delete versions when they are no longer needed.
- Avoid creating versions that will never be reconciled with SDE.DEFAULT.
You could also run a reconcile service each evening to reconcile (without posting) as many of the older versions as possible. This operation simplifies the state tree, so that when a compress is finally executed, it can trim the state tree. Version management can be performed in the Version Management dialog box in ArcCatalog or ArcMap.
For more information, read the ESRI technical white paper titled Versioning Workflows on the ESRI support site.
4. Compress the geodatabase often
Compressing an ArcSDE geodatabase helps maintain database performance by removing unused data.
Specifically it does two things:
- First, it removes unreferenced states and their associated delta table rows.
- Second, it moves entries in the delta tables that are common to all versions into the base tables, thus reducing the amount of data that the database searches through when executing queries. In effect, a compress will improve query performance and system response time by reducing the depth and complexity of the state tree.
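The second step can be sketched with sets (an illustrative model, not the actual ArcSDE table layout): delta rows referenced by every version are promoted to the base table and dropped from the deltas.

```python
def compress(base_rows, delta_rows, versions):
    """Move delta entries visible to all versions into the base table.
    delta_rows maps a row id to the set of versions that reference it."""
    common = {rid for rid, seen in delta_rows.items() if seen == set(versions)}
    base_rows |= common          # promote the common rows to the base table
    for rid in common:
        del delta_rows[rid]      # and remove them from the delta tables
    return base_rows, delta_rows

base = {1, 2}                                # rows already in the base table
delta = {10: {"DEFAULT", "QA"}, 11: {"QA"}}  # rows still in the delta tables
base, delta = compress(base, delta, versions=["DEFAULT", "QA"])
print(sorted(base))   # -> [1, 2, 10]  (row 10 was common to all versions)
print(sorted(delta))  # -> [11]        (row 11 is visible only to QA)
```

Fewer delta rows means each query resolves a version’s view of the data with less work, which is where the performance benefit of a compress comes from.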
When a large volume of uncompressed changes has accumulated in an ArcSDE geodatabase, a compress operation can take hours or even days. This is another very common cause of poor performance. To avoid this, you should compress on a regular basis (daily or weekly, and after periods of high editing activity). Users can stay connected to the geodatabase during a compress, but we suggest that all users be disconnected for the compress operation to be fully effective.
Remember to update statistics before and after a compress, and note the one exception mentioned earlier. The compress command is available in ArcCatalog. You add the command from the Customize dialog box, and you must be connected as the SDE user to execute it, or you could execute a compress with SDE commands.
For more information, see KB Article #29160 titled How to Compress a Version Database to State Zero.
5. Monitor system resources
When experiencing intermittent performance issues, it may be helpful to monitor the memory and CPU usage on both the client and server machines. This may help identify on which machine the performance bottleneck is occurring. For memory, it is important to ensure that the operating system is not running out of available memory and resorting to swap space (in other words, virtual memory). An enterprise-level ArcSDE instance typically needs at least one gigabyte of free memory to operate efficiently. For CPU, you want to reduce how often the system hits one hundred percent CPU usage. Some troubleshooting suggestions to improve server performance include:
- Closing unrelated applications on the server
- Performing a database trace to examine and review what’s happening in the database
- Having users switch from application server connections to direct connections (this puts more workload on the client and less on the server)
For tips on improving client performance, refer to the ESRI Instructional Series Podcast titled Performance Tips and Tricks: ArcSDE Client-Side Optimization.
So, just to review, the performance of an ArcSDE geodatabase is influenced by many factors: hardware configuration, network configuration, network traffic, and the number of concurrent users.
The five best practices for maintaining an ArcSDE geodatabase covered in this post were:
- Increase the frequency of updating statistics on tables
- Rebuild indexes on tables
- Plan parent-child version relationships carefully
- Compress the geodatabase often
- Monitor system resources
For more information, see the help topic An Overview of Tuning an ArcSDE Geodatabase.
ESRI also offers several instructor-led training classes on the configuration and tuning of ArcSDE geodatabases, based on DB2, Informix, Oracle and SQL Server database platforms.
Now that the dev summit is over we can tone down the amateur journalism and get into some real topics. Seeing as how this blog is titled “Inside the Geodatabase”, I thought a good place to start would be an introductory topic on the geodatabase. So here is the first in a series of posts we’re calling “Geodatabase Essentials”. Future posts tagged with this title will contain introductory information laying the foundations for essential geodatabase topics.
What is the Geodatabase?
The geodatabase is the native data storage and data management framework for ArcGIS. Why would you want to use a geodatabase? Because it acts as an organizational tool to store and manage your data, and is also the gateway into advanced GIS capabilities.
The geodatabase is a container which houses a collection of various geographic datasets.
Geodatabases support all of the different types of data that can be used by ArcGIS. Also, there is a complete set of conversion tools available so you can easily migrate existing geospatial data into the geodatabase.
At face value the fundamental ArcGIS datasets are tables, feature classes, and rasters. These and other more complex datasets, such as topologies and geometric networks, are all contained within the geodatabase. The geodatabase can also add advanced capabilities to these datasets and model behavior. Some examples of this are:
- Data Validation using domains and subtypes
- Multiuser editing environment through versioning
- Topologies to enforce the integrity of your spatial data
- Networks to model and analyze flows
- Terrains for modeling surfaces and Lidar data management
- Distributed geodatabase replication
- Managing historical archives of your data
There are three types of geodatabases: personal, file, and ArcSDE.
Personal geodatabases were first introduced in ArcGIS 8.0 and are designed for a single user working with smaller datasets. They are stored and managed in Microsoft® Access™, which ties them to the Windows platform.
One thing a lot of users like about the personal geodatabase is the ability to manage the tabular data using Access.
Access-based personal geodatabases work well for small datasets, and they support all the features of the geodatabase model such as topologies, raster catalogs, network datasets, address locators, etc. They are single user and therefore do not support versioning and long transactions.
File geodatabases, introduced at ArcGIS 9.2, store datasets in a file system folder and are portable across operating systems. They are suitable for single-user projects and small workgroups with one editor and multiple readers. Although they do not support versioning, it is possible to have multiple editors with a file geodatabase, provided they aren’t editing the same feature datasets, feature classes, or tables.
The file geodatabase is optimized for use in ArcGIS, so it provides very fast data use and storage, and can scale to over 1 terabyte in size. The file geodatabase also allows you to optionally compress your vector data, reducing the disk footprint of its storage without affecting performance.
ArcSDE geodatabases manage spatial data within an RDBMS such as DB2, Informix, Oracle, SQL Server, PostgreSQL and SQL Server Express. Through this architecture, ArcSDE offers a multi-user editing environment and can manage extremely large datasets. ArcSDE geodatabases also support version-based workflows such as geodatabase replication and archiving that are not supported with file and personal geodatabases.
Organizations requiring the full suite of geodatabase functionality and a geodatabase with the capacity for extremely large, continuous GIS datasets that can be edited and accessed by many users should use an ArcSDE geodatabase.
9.3 Beta users can find this information and more on the Geodatabase Resource Center page.
Also for beta users, a few useful topics from the help system covering this information in more detail are: An overview of the geodatabase, Essential readings about the geodatabase, and Types of Geodatabases.
A few sessions were offered by the geodatabase team on the final day of the Dev Summit. Developing with Rasters in ArcGIS and Distributed Geodatabase Development were given in the morning, and the second offering of Effective Geodatabase Programming was given in the afternoon.
The raster session, given by Hong Xu, Joe Roubal, and Peng Gao, discussed typical developing programming patterns for creating raster centric applications. The main topics were: how to create and visualize your raster data, how to create custom geodata transformations, creating custom pixel filters, and how to use the new APIs in 9.3 to work with image services and WCS services.
Here are some sample slides from the presentation: the following slide shows how developers can create custom geodata transformations based on their own image formats and plug these into ArcGIS. This way their proprietary images can be used by ArcGIS.
The talk also looked into how developers can access image services (shown below) and WCS services to get a raster from the layer and use it for spatial analysis operations.
In the distributed data session, Gary MacDougal and Khaled Hassen started with an overview of data distribution techniques before presenting the key elements of geodatabase replication and the replication API.
The following slide shows some common use cases for distributing geodatabases which can be accommodated through geodatabase replication.
As a sample of some of the code shown during the session, this next slide describes how a developer can extend replica creation through custom behavior.
After the morning presentations, Jim McKinney MCed the Closing Session while everyone was finishing a sit-down lunch in the Oasis room. He went over some feedback from attendees of the conference, basically describing what people thought we did well and some things that could be done better. In raise-your-hand survey fashion, Jim gathered feedback in the room about conference specifics such as the length of the conference, the time and length of sessions, session topics, whether we should offer “twilight sessions” (a handful of hardcore developers who crave night-time sessions raised their hands for this), whether we should have sessions based on a user track, and so on.
The overall consensus was that we struck a sweet spot this year as far as the length of the conference, staffing, and session content. This was our biggest and probably our best Developer Summit so far.
From our team’s perspective, we gathered a lot of useful information and feedback from users and felt as though we were able to help a lot of developers find answers to their questions. Our Meet the Development Team session on Wednesday was a good indication of the value of personal interaction with the developer community.
Thanks to everyone that attended and to everyone that came to the Geodatabase Island to talk with the development team. We hope the Developer Summit was an interesting and beneficial experience for you – it certainly was for us.
Wednesday morning started with a talk from Alan Cooper, this year’s keynote speaker. The “Father of Visual Basic” delivered an insightful presentation titled Post Industrial Management. The talk compared past management strategies of the atom-based industrial age to management strategies of today’s knowledge-based era. He delved into a structured approach to managing developers by segregating them into three focused job descriptors, which he labeled Interaction Designer, Design Engineer, and Production Engineer.
During the talk Cooper threw out many shrewd nuggets of wisdom as well as humorous and accurate observations, such as “Building software is like walking through a minefield. If you don’t hit a mine it’s really quick”. His knack for pairing key concepts and ideas with aptly chosen metaphors made for a lighthearted and instructive presentation. All in all it was a very interesting talk that inspired a great deal of discussion throughout the day.
We aren’t allowed to distribute the actual presentation, but many of the concepts are covered in a similar article on Cooper’s site titled Design Engineering: The Next Step.
In the afternoon, Forrest Jones and Brent Pierce gave the Implementing Enterprise Applications with the Geodatabase session.
This session was designed to take an enterprise-centric view of common APIs that enterprise developers regularly need to work with. The session also explored many different tips and tricks to improve overall application performance.
It finished on a practical note with a group of slides covering database tuning and tracing of the complete enterprise Geodatabase stack.
The morning of the Dev Summit was spent in the Plenary Session. The Plenary acts as a presentation looking at some new product functionality and more recent projects that have been developed. This year’s Plenary went smoothly, and the talks and demos did a great job of highlighting some key projects that the development teams at ESRI have been working on lately.
There was a good flow to the presentation as Jim McKinney used the newly launched Resource Center as a staging point to introduce each of the development teams and their respective lead developers.
Each team did a good job of not only looking at recent projects from a user perspective, but also delving into the developer perspective, showing code and programming examples of how things work behind the scenes.
Following the Plenary session, the technical tracks kicked off. From the geodatabase perspective, the “Effective Geodatabase Programming” session was presented by Brent Pierce and Erik Hoel, a senior developer on the geodatabase team. This session dealt with low-level programming patterns that should be followed when programming with the geodatabase API. The presentation tackled the subjects a developer needs to know to use the geodatabase API effectively.
Here are some slides highlighting the session contents as a teaser for those that might want to grab the whole presentation on EDN after the conference.
Tom also went into detail about how to work with the various spatial types available on the Oracle DBMS platform. Here is a slide highlighting some new operations added at 9.3.
Tomorrow the Geodatabase Team will be giving two sessions: Implementing Enterprise Applications with the Geodatabase and the first of two offerings of the Distributed Geodatabase Development session which delves into geodatabase replication.
Also, tomorrow morning Alan Cooper is giving a keynote address, which should make for an interesting talk and generate some buzz and discussion.