
Developer Tips – Using the Field Checker

This post was written by geodatabase Product Engineer James MacKay. James works on the geodatabase development team and is responsible for much of the geodatabase SDK content.

The IFieldChecker interface provides a way to validate fields for a specific workspace before they’re created. Reserved keywords, special characters, and maximum field name lengths vary between different types of workspaces and DBMSs; the field checker will not only detect fields that violate these rules, but will also generate a “fixed” fields collection in which invalid names are replaced with similar (but valid) ones. Field checkers come in handy in a number of common situations.

Most obviously, a field checker should be used whenever an application allows users to enter field names manually, but it’s a good defensive programming pattern to use one all of the time. The following code shows how to validate fields prior to a CreateTable or CreateFeatureClass call.

Geoprocessing users: The geoprocessor also exposes field checking capability. See this article for more information.

public IFields ValidateFields(IWorkspace workspace, IFields fields)
{
      // Create and initialize a field checker.
      IFieldChecker fieldChecker = new FieldCheckerClass();
      fieldChecker.ValidateWorkspace = workspace;

      // Generate a validated fields collection.
      IFields validatedFields = null;
      IEnumFieldError enumFieldError = null;
      fieldChecker.Validate(fields, out enumFieldError, out validatedFields);

      // You can either notify the user of any errors or skip this step; note
      // that the error enumerator is null when every field passed validation.
      if (enumFieldError != null)
      {
            IFieldError fieldError = null;
            while ((fieldError = enumFieldError.Next()) != null)
            {
                  Console.WriteLine("Error in field {0}: {1}", fieldError.FieldIndex, fieldError.FieldError);
            }
      }

      // Return the validated fields.
      return validatedFields;
}
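For context, here’s how the method above might be wired into a table creation call. This is a minimal sketch rather than production code: the table name "Parcels" and the proposedFields variable are hypothetical. The key point is to pass the validated fields collection, not the original one, to CreateTable.

public ITable CreateValidatedTable(IWorkspace workspace, IFields proposedFields)
{
      // Validate first, then create the table from the fixed fields collection.
      IFields validatedFields = ValidateFields(workspace, proposedFields);
      IFeatureWorkspace featureWorkspace = (IFeatureWorkspace)workspace;
      return featureWorkspace.CreateTable("Parcels", validatedFields, null, null, "");
}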


New Georeferencing Video in Raster Web Help

Just a heads-up to let you know that we’ve posted a georeferencing video in the raster section of the help. We’ve been receiving a lot of support calls about georeferencing, so hopefully this video will help people understand and implement it a little better.

If you’re having trouble georeferencing raster data, or are simply interested, you should check out the video.


Five Best Practices for Maintaining an ArcSDE Geodatabase

This blog entry has been taken from a podcast previously recorded by Derek Law from the geodatabase team. We really liked the podcast and thought there was some valuable info to be shared. To hear it, and to check out other useful podcast topics, visit: http://www.esri.com/news/podcasts/instructional_series.html

The performance of an enterprise level ArcSDE geodatabase is influenced by many factors, such as hardware configuration, network configuration, network traffic, and the number of concurrent users.

The tips in this blog entry are not database platform-specific, but they are general tips that will hopefully enable you to improve the performance of your enterprise geodatabase.

So, five best practices for maintaining an ArcSDE geodatabase are:

  • Increase the frequency of updating statistics on tables
  • Rebuild indexes on tables
  • Plan parent-child version relationships carefully
  • Compress the geodatabase often
  • Monitor system resources

1. Increase the frequency of updating statistics on tables

Statistics in the database describe the column data stored in tables. They help the database Query Optimizer estimate the selectivity of SQL expressions and accurately assess the cost of different query plans, so that it can choose the most efficient execution plan for retrieving and/or updating data. Stale or missing statistics are a frequent cause of poor performance; keeping statistics accurate and up to date enables the Query Optimizer to make better assessments of query execution plans.

The frequency of updating statistics will depend on the editing activity in the geodatabase; typically, more editing activity means you should update statistics more frequently. This is the responsibility of the database administrator, not the ArcSDE software, which does not maintain statistics. You can update statistics for a table or feature class in ArcCatalog with the Analyze Components dialog box, which updates the statistics for the supporting tables associated with the selected object.
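The same analyze operation is also exposed through the geodatabase API, which is handy for scripted maintenance. Below is a minimal sketch: the cast to IDatasetAnalyze is standard, but treat the esriAllTableComponents enumeration member name as an assumption and verify it against your SDK version.

public void UpdateStatistics(IFeatureClass featureClass)
{
      // Programmatic equivalent of the Analyze command in ArcCatalog: update
      // statistics on all supporting tables of the feature class.
      // (Enum member name is an assumption; check your SDK.)
      IDatasetAnalyze datasetAnalyze = (IDatasetAnalyze)featureClass;
      datasetAnalyze.Analyze((int)esriTableComponents.esriAllTableComponents);
}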

You should also update statistics on the SDE repository tables; this can be done with database management software. As a general rule of thumb, we suggest you update statistics weekly or monthly at a minimum, and typically before and after a compress. This can be automated at the database level.

There is one exception: in situations where all users are editing just the SDE.DEFAULT version, keep the statistics you collected before the compress rather than updating them afterward. This ensures the Query Optimizer still treats the delta tables as active.

2. Rebuild indexes on tables

Indexes are used in a database to help speed up the retrieval of rows from a table, and they are also used by the database Query Optimizer when assessing query plans. As tables are modified by updates, inserts, and deletes of records, the corresponding indexes can become fragmented and unbalanced. This leads to increased I/O, which affects performance. This tip works in conjunction with the previous one: if you update statistics frequently, you should also consider rebuilding indexes when they become fragmented. Both actions will help improve performance.

In general, accurate statistics help to define a good index, and you can assess an index’s usefulness with database management tools by monitoring its usage. Another benefit of rebuilding indexes is that you may reclaim disk space lost to fragmentation. In versioned editing environments where edits are performed daily, consider rebuilding indexes at regular intervals (for example, weekly or monthly) to keep performance degradation under control; we recommend rebuilding indexes after a compress. You can rebuild indexes within a database management program or with ArcSDE commands.
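If you’d rather script an index rebuild through the geodatabase API than use DBMS tools or ArcSDE commands, one approach is to copy the index definition, drop the index, and re-create it. This is a sketch under assumptions: the index name is supplied by the caller, and error handling is omitted.

public void RebuildIndex(IClass tableClass, string indexName)
{
      int position;
      tableClass.Indexes.FindIndex(indexName, out position);
      if (position == -1)
            return;

      // Copy the definition before dropping the index.
      IIndex oldIndex = tableClass.Indexes.get_Index(position);
      IIndexEdit newIndex = new IndexClass();
      newIndex.Name_2 = oldIndex.Name;
      newIndex.Fields_2 = oldIndex.Fields;

      // Drop the fragmented index and re-create it from the copied definition.
      tableClass.DeleteIndex(oldIndex);
      tableClass.AddIndex((IIndex)newIndex);
}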

For more information, see Knowledge Base (KB) Article #24518, titled FAQ: How can ArcSDE performance be improved?

3. Plan parent-child version relationships carefully

The versioning environment within an ArcSDE geodatabase enables users to implement and sustain complex business workflows. Typically, the number of versions and how they are interrelated will depend on your business workflow. It is important to manage versions in the geodatabase properly, because poor version management will impact performance. Keep the following in mind: every edit in the geodatabase adds a state to the state tree. A state tree represents the total set of edit states stored in a geodatabase. Think of it conceptually as a flow chart of circles and lines flowing from top to bottom: each circle represents an edit state, and each state is linked by a line showing the edit history in the geodatabase.

A state tree typically has a structure similar to an upside-down tree, starting with one circle at the top (state zero) and flowing down in many branches. For example, a busy ArcSDE geodatabase may accumulate approximately one million edits per day, resulting in hundreds of thousands of edit states in the state tree.

Ideally, you want to keep the state tree as simple and as small as possible. Versions are pointers to an edit state, and they “pin” the state tree; in other words, they prevent its structure from being simplified. This can affect performance, because queries may take longer to execute. The more complex the versioning model (in other words, the more versions you have), the more potential records in the delta tables, and the slower performance can become.

In general, you should try to do the following:

  • Reconcile versions to the SDE.DEFAULT version as soon as you can.
  • Delete versions when they are no longer needed.
  • Avoid creating versions that will never be reconciled with SDE.DEFAULT.

You could also run reconcile services each evening, to reconcile (without posting) as many older versions as possible. This operation simplifies the state tree, so that when a compress is finally executed, it can trim the state tree. Version management can be performed in the Version Management dialog box in ArcCatalog or ArcMap.
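As a small example of the second point in the list above, deleting a no-longer-needed version programmatically is straightforward. A minimal sketch, assuming an ArcSDE connection made as a user with rights to the version; the version name is supplied by the caller.

public void DeleteStaleVersion(IWorkspace workspace, string versionName)
{
      // Deleting the version unpins its states, so a later compress can trim them.
      IVersionedWorkspace versionedWorkspace = (IVersionedWorkspace)workspace;
      IVersion staleVersion = versionedWorkspace.FindVersion(versionName);
      staleVersion.Delete();
}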

For more information, read the ESRI technical white paper titled Versioning Workflows on the ESRI support site.

4. Compress the geodatabase often

Compressing an ArcSDE geodatabase helps maintain database performance by removing unused data.

Specifically it does two things:

  • First, it removes unreferenced states and their associated delta table rows.
  • Second, it moves entries in the delta tables that are common to all versions into the base tables, thus reducing the amount of data that the database searches through when executing queries. In effect, a compress will improve query performance and system response time by reducing the depth and complexity of the state tree.

When a large volume of uncompressed changes has accumulated in an ArcSDE geodatabase, a compress operation can take hours or even days; this is another very common cause of poor performance. To avoid this, you should compress on a regular basis (daily or weekly, and after periods of high editing activity). Users can stay connected to the geodatabase during a compress, but we suggest that all users be disconnected for the compress operation to be fully effective.

Remember to update statistics before and after a compress, noting the one exception mentioned earlier. The Compress command is available in ArcCatalog: you add it from the Customize dialog box, and you must be connected as the SDE user to execute it. Alternatively, you can run a compress with ArcSDE commands.
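The compress operation is also exposed through the geodatabase API, which makes it easy to automate. A minimal sketch, assuming the workspace is an ArcSDE connection made as the SDE user:

public void CompressGeodatabase(IWorkspace workspace)
{
      // Removes unreferenced states and moves rows common to all versions into
      // the base tables. Ideally run with other users disconnected, and update
      // statistics before and after (see tip 1).
      IVersionedWorkspace versionedWorkspace = (IVersionedWorkspace)workspace;
      versionedWorkspace.Compress();
}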

For more information, see KB Article #29160, titled How to Compress a Versioned Database to State Zero.

5. Monitor system resources

When experiencing intermittent performance issues, it may be helpful to monitor memory and CPU usage on both the client and server machines; this can help identify which machine the performance bottleneck is on. For memory, it is important to ensure that the operating system is not running out of available memory and resorting to swap space (in other words, virtual memory). Enterprise-level ArcSDE typically needs at least one gigabyte of free disk space to operate efficiently. For CPU, you want to avoid, or at least reduce, how often the system hits 100 percent CPU usage. Some troubleshooting suggestions to improve server performance include:

  • Closing unrelated applications on the server
  • Performing a database trace to examine what is happening inside the database
  • Switching users from application server connections to direct connections (this puts more of the workload on the client and less on the server)

For tips on improving client performance, refer to the ESRI Instructional Series Podcast titled Performance Tips and Tricks: ArcSDE Client-Side Optimization.

So, just to review, the performance of an ArcSDE geodatabase is influenced by many factors: hardware configuration, network configuration, network traffic, and the number of concurrent users.

The five best practices for maintaining an ArcSDE geodatabase covered in this post were:

  • Increase the frequency of updating statistics on tables
  • Rebuild indexes on tables
  • Plan parent-child version relationships carefully
  • Compress the geodatabase often
  • Monitor system resources

For more information, see the help topic An Overview of Tuning an ArcSDE Geodatabase.

ESRI also offers several instructor-led training classes on the configuration and tuning of ArcSDE geodatabases, based on DB2, Informix, Oracle and SQL Server database platforms.


Geodatabase Essentials – Part I: What is the Geodatabase?

Now that the dev summit is over we can tone down the amateur journalism and get into some real topics. Seeing as how this blog is titled “Inside the Geodatabase”, I thought a good place to start would be an introductory topic on the geodatabase. So here is the first in a series of posts we’re calling “Geodatabase Essentials”. Future posts tagged with this title will contain introductory information laying the foundations for essential geodatabase topics.

What is the Geodatabase?

The geodatabase is the native data storage and data management framework for ArcGIS. Why would you want to use a geodatabase? Because it acts as an organizational tool to store and manage your data, and is also the gateway into advanced GIS capabilities.

The geodatabase is a container which houses a collection of various geographic datasets.

Geodatabases support all of the different types of data that can be used by ArcGIS. Also, there is a complete set of conversion tools available so you can easily migrate existing geospatial data into the geodatabase.

At face value the fundamental ArcGIS datasets are tables, feature classes, and rasters. These and other more complex datasets, such as topologies and geometric networks, are all contained within the geodatabase. The geodatabase can also add advanced capabilities to these datasets and model behavior. Some examples of this are:

  • Data Validation using domains and subtypes
  • Multiuser editing environment through versioning
  • Topologies to enforce the integrity of your spatial data
  • Networks to model and analyze flows
  • Terrains for modeling surfaces and Lidar data management
  • Distributed geodatabase replication
  • Managing historical archives of your data

There are three types of geodatabases: Personal, File, and ArcSDE.

Personal Geodatabases

Personal geodatabases were first introduced in ArcGIS 8.0 and are designed for a single user working with smaller datasets. They are stored and managed in Microsoft® Access™, which ties them to the Windows platform.

One thing a lot of users like about the personal geodatabase is the ability to manage the tabular data using Access.

Access-based personal geodatabases work well for small datasets, and they support all the features of the geodatabase model, such as topologies, raster catalogs, network datasets, address locators, and so on. They are single user, however, and therefore do not support versioning or long transactions.

File Geodatabases

File geodatabases, introduced at ArcGIS 9.2, store datasets in a file system folder and are portable across operating systems. They are suitable for single-user projects and small workgroups with one editor and multiple readers. Although they do not support versioning, it is possible to have multiple editors with a file geodatabase, provided they aren’t editing the same feature datasets, feature classes, or tables.

The file geodatabase is optimized for use in ArcGIS, so it provides very fast data access and storage, and it can scale to over 1 terabyte in size. The file geodatabase also lets you optionally compress your vector data, reducing its storage footprint without affecting performance.

ArcSDE Geodatabases

ArcSDE geodatabases manage spatial data within an RDBMS such as DB2, Informix, Oracle, SQL Server, PostgreSQL and SQL Server Express. Through this architecture, ArcSDE offers a multi-user editing environment and can manage extremely large datasets. ArcSDE geodatabases also support version-based workflows such as geodatabase replication and archiving that are not supported with file and personal geodatabases.

Organizations requiring the full suite of geodatabase functionality and a geodatabase with the capacity for extremely large, continuous GIS datasets that can be edited and accessed by many users should use an ArcSDE geodatabase.
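To make the three types concrete for developers, here is a minimal sketch of opening each one through the ArcObjects workspace factories. The paths and the .sde connection file are hypothetical, and licensing initialization and error handling are omitted.

// Personal geodatabase (Microsoft Access .mdb file).
IWorkspaceFactory accessFactory = new AccessWorkspaceFactoryClass();
IWorkspace personalGdb = accessFactory.OpenFromFile(@"C:\data\parcels.mdb", 0);

// File geodatabase (.gdb folder on disk).
IWorkspaceFactory fileGdbFactory = new FileGDBWorkspaceFactoryClass();
IWorkspace fileGdb = fileGdbFactory.OpenFromFile(@"C:\data\parcels.gdb", 0);

// ArcSDE geodatabase (connection file pointing at the RDBMS).
IWorkspaceFactory sdeFactory = new SdeWorkspaceFactoryClass();
IWorkspace sdeGdb = sdeFactory.OpenFromFile(@"C:\data\production.sde", 0);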

9.3 Beta users can find this information and more on the Geodatabase Resource Center page.

Also for beta users, a few useful topics from the help system covering this information in more detail are: An overview of the geodatabase, Essential readings about the geodatabase, and Types of Geodatabases.

Users not in the beta program can access similar topics covering this information in more detail from the 9.2 web help. See An overview of the geodatabase and Types of geodatabases.


From the Dev Summit – Thursday March 20, 2008

A few sessions were offered by the geodatabase team on the final day of the Dev Summit. Developing with Rasters in ArcGIS and Distributed Geodatabase Development were given in the morning, and the second offering of Effective Geodatabase Programming was given in the afternoon.


The raster session, given by Hong Xu, Joe Roubal, and Peng Gao, discussed typical programming patterns for creating raster-centric applications. The main topics were how to create and visualize raster data, how to create custom geodata transformations, how to create custom pixel filters, and how to use the new APIs in 9.3 to work with image services and WCS services.


Here are some sample slides from the presentation. The following slide shows how developers can create custom geodata transformations based on their own image formats and plug them into ArcGIS, so that their proprietary images can be used by ArcGIS.


The talk also looked into how developers can access image services (shown below) and WCS services to get a raster from the layer and use it for spatial analysis operations.


In the distributed data session, Gary MacDougal and Khaled Hassen started with an overview of data distribution techniques before presenting the key elements of geodatabase replication and the replication API.


The following slide shows some common use cases for distributing geodatabases which can be accommodated through geodatabase replication.


As a sample of some of the code shown during the session, this next slide describes how a developer can extend replica creation through custom behavior.


After the morning presentations, Jim McKinney MCed the Closing Session while everyone finished a sit-down lunch in the Oasis room. He went over feedback from attendees of the conference, describing what people thought we did well and what could be done better. In raise-your-hand survey fashion, Jim gathered feedback in the room about conference specifics such as the length of the conference, the time and length of sessions, session topics, whether we should offer “twilight sessions” (a handful of hardcore developers who crave night-time sessions raised their hands for this), whether we should have sessions based on a user track, and so on.


The overall consensus was that we struck a sweet spot this year as far as the length of the conference, staffing, and session content. This was our biggest and probably our best Developer Summit so far.


From our team’s perspective, we gathered a lot of useful information and feedback from users, and we felt we were able to help a lot of developers find answers to their questions. Our Meet the Development Team session on Wednesday was a good indication of the value of personal interaction with the developer community.


Thanks to everyone who attended and to everyone who came to the Geodatabase Island to talk with the development team. We hope the Developer Summit was an interesting and beneficial experience for you – it certainly was for us.


From the Dev Summit – Wednesday March 19, 2008

Wednesday morning started with a talk from Alan Cooper, this year’s keynote speaker. The “Father of Visual Basic” delivered an insightful presentation titled Post Industrial Management, which compared the management strategies of the atom-based industrial age to those of today’s knowledge-based era. He delved into a structured approach to managing developers by segregating them into three focused job descriptors, which he labeled Interaction Designer, Design Engineer, and Production Engineer.

During the talk, Cooper threw out many shrewd nuggets of wisdom as well as humorous and accurate observations, such as “Building software is like walking through a minefield. If you don’t hit a mine it’s really quick.” His knack for pairing key concepts and ideas with aptly chosen metaphors made for a lighthearted and instructive presentation. All in all, it was a very interesting talk that inspired a great deal of discussion throughout the day.

We aren’t allowed to distribute the actual presentation, but many of the concepts are covered in a similar article on Cooper’s site titled Design Engineering: The Next Step.

In the afternoon, Forrest Jones and Brent Pierce gave the Implementing Enterprise Applications with the Geodatabase session.


This session was designed to take an enterprise-centric view of common APIs that enterprise developers regularly need to work with. The session also explored many different tips and tricks to improve overall application performance.


It finished on a practical note with a group of slides covering database tuning and tracing of the complete enterprise Geodatabase stack.



From the ESRI Developer Summit – Tuesday March 18, 2008

The morning of the Dev Summit was spent in the Plenary Session, which serves as a look at new product functionality and recently developed projects. This year’s Plenary went smoothly, and the talks and demos did a great job of highlighting some key projects that the development teams at ESRI have been working on lately.

There was a good flow to the presentation as Jim McKinney used the newly launched Resource Center as a staging point to introduce each of the development teams and their respective lead developers.


Each team did a good job of not only looking at recent projects from a user perspective, but also delving into the developer perspective, showing code and programming examples of how things work behind the scenes.

Following the Plenary Session, the technical tracks kicked off. From the geodatabase perspective, the “Effective Geodatabase Programming” session was presented by Brent Pierce and Erik Hoel, a senior developer on the geodatabase team. This session dealt with very low-level programming patterns that should be followed when programming with the geodatabase API, tackling the subjects a developer needs to know to use the API effectively.


Here are some slides highlighting the session contents, as a teaser for those who might want to grab the whole presentation on EDN after the conference.


The presenters also went into great detail about the inner workings of the geodatabase. The next slide, explaining the client/server cursor buffering model, is a good example of this.


To keep the session upbeat, Erik injected his classic sense of humor into some slides, including interesting images on slides where he was highlighting bad programming patterns with the geodatabase API (most people got the point…).


Following this session, our in-house database gurus Tom Brown, Kevin Watt, and Brijesh Shrivastav gave their “Working with the Geodatabase Effectively Using SQL” session. This popular presentation led to large crowds and a series of very interesting spillover discussions following the talk.

This session went into detail on working with the geodatabase at the SQL level. The presenters dealt with the PostgreSQL, SQL Server 2008, and Oracle DBMS platforms. Here is a slide given by Brijesh concerning working with geometries in the spatial type for PostgreSQL, functionality being released in ArcGIS 9.3.

Tom also went into detail about how to work with the various spatial types available on the Oracle DBMS platform. Here is a slide highlighting some new operations added at 9.3.

Tomorrow the Geodatabase Team will be giving two sessions: Implementing Enterprise Applications with the Geodatabase and the first of two offerings of the Distributed Geodatabase Development session, which delves into geodatabase replication.

Also, tomorrow morning Alan Cooper is giving the keynote address, which should make for an interesting talk and generate some buzz and discussion.


From the Dev Summit – Monday March 17, 2008

Craig Gillgrass, Colin Zwicker, Jessica Parteno, and James MacKay gave a pre-conference talk today at the Developer Summit in Palm Springs called the Developer’s Guide to the Geodatabase. The session covered an overview of the geodatabase and some best practices, and showed several demos of how to work with the geodatabase API.

The talk drew a decent crowd of about 300 and generated some great discussion. Craig had a healthy 100% joke bombing percentage that drew some muffled sympathy laughs.

I pulled some slides from their presentation to post on the blog and give you an idea of what the talk was about. All of the slides from the Dev Summit presentations will be publicly available on EDN. The full PDF of this presentation can be found HERE.

The presentation went into detail on tips and tricks, best practices, object model diagrams, and suggested programming patterns when using the geodatabase API.

The presentation outline for the session was the following:

The slide below gives a decent overview of the datasets that can be stored within a geodatabase, as well as some example behavior that the geodatabase can implement to manage data validation and data integrity:

This sample code on the efficient use of FindField was one of many helpful developer hints and best practices discussed during the session.
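The slide itself isn’t reproduced here, but the gist of the FindField tip is worth sketching. FindField is comparatively expensive, so look the field index up once and reuse it, rather than calling FindField for every row; the field name "NAME" below is hypothetical.

// Look the field index up once, outside the loop...
int nameIndex = featureClass.FindField("NAME");
IFeatureCursor cursor = featureClass.Search(null, true);
IFeature feature = null;
while ((feature = cursor.NextFeature()) != null)
{
      // ...and reuse it for every row, instead of calling FindField here.
      string name = (string)feature.get_Value(nameIndex);
}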

The presenters also touched on several key object model diagrams; this one describes the feature class object model, which you need to understand when creating feature classes with the geodatabase API.

This pre-conference seminar was a good way to touch base with developers before the official kick-off to the 2008 Developer Summit.


The 2008 ESRI Developer Summit

We’re days away from the annual ESRI Developer Summit down in Palm Springs. A lot of members of the geodatabase team are going to be there presenting technical sessions. We’ll also be able to meet one-on-one with you to talk about projects you’re working on and answer questions.

Team members will be floating around our designated island in the showcase area in between sessions, so you can come and find us there. We’ll be more than happy to talk and there will be lots of team members with varied areas of expertise, so we can direct specific questions to the right person.

Perhaps the best time to mingle with the geodatabase team is during the ‘Meet the Development Team’ session. This is being held on Wednesday from 1:30 – 2:30 in the showcase area.


Our team is giving five technical sessions and one pre-conference seminar this year:

  • Developer’s Guide to the Geodatabase: Monday afternoon
  • Effective Geodatabase Programming: Tuesday 2:45 – 4:00, Thursday 1:30 – 2:45
  • Working with the Geodatabase Effectively Using SQL: Tuesday 4:30 – 5:45
  • Implementing Enterprise Applications with the Geodatabase: Wednesday 2:45 – 4:00
  • Distributed Geodatabase Development: Wednesday 4:30 – 5:45, Thursday 8:30 – 9:45
  • Developing with Rasters in ArcGIS: Thursday 8:30 – 9:45

Members of the development team will be on location and posting to the blog throughout the Dev Summit, so check back for updates and info from each day of the conference. 


Welcome Inside the Geodatabase

We’re excited to be launching this new Geodatabase-centric blog (unofficially titled ‘Inside the Geodatabase’).


The scope of this blog will range from introductory information regarding general geodatabase functionality to more advanced topics and developer-related material. We’ll be blogging on things like best practices, new and existing geodatabase functionality, example workflows, and updates from events that team members are part of.


Also, when new help documentation or SDK content is written we’ll throw it on here first, not only so that you’ll know it’s available, but also to give users a fresh first glance.


On top of this there will be other media content such as:


  • Code Examples
  • Graphics
  • Instructional Videos
  • PowerPoint Presentations
  • Podcasts
  • Etc.

This blog is written by the Geodatabase Development Team, and we’re hoping that it grows into a valuable resource for the user community, so stay tuned…
