Convert ArcGIS feature layer string fields storing boolean values to integer fields with domains

arcgis python

Issue

Screenshot of string field storing boolean attributes -- bad!

Storing boolean attributes (e.g. Yes/No values) in a string field without any domain is the most common data quality issue I encounter when managing feature layers in ArcGIS Online/Enterprise. Doing so introduces the potential for data inconsistency, increases the complexity of maintenance/analysis, and wastes space. If this issue gets introduced, it tends to persist, since there’s currently no easy fix.

Solution

Here’s a quick snippet for updating a feature layer string field storing Yes/No values to an integer field with a Yes=1/No=0 domain:

from getpass import getpass

from arcgis.gis import GIS

gis = GIS("https://arcgis.com", "<username>", getpass())

questionable_data_choices_item = gis.content.get("<item_id>")
questionable_data_choices_layer = questionable_data_choices_item.layers[<layer_index>]

# Create a temporary integer field to store the converted string values
temp_new_field = {
    "name": "TempActiveWell",
    "alias": "Temp Active Well",
    "type": "esriFieldTypeInteger",
    "editable": True,
    "nullable": True,
    "defaultValue": None,
    "description": None
}
questionable_data_choices_layer.manager.add_to_definition({"fields": [temp_new_field]})

# Convert string values to integers using SQL CASE expression
sql_expression = "CASE WHEN ActiveWell = 'Yes' THEN 1 WHEN ActiveWell = 'No' THEN 0 ELSE NULL END"
questionable_data_choices_layer.calculate(where="1=1", calc_expression={"field": "TempActiveWell", "sqlExpression": sql_expression})

# Now that values are converted and stored in temp field, delete original string field and recreate as integer field with domain
questionable_data_choices_layer.manager.delete_from_definition({"fields": [{"name": "ActiveWell"}]})
new_field = {
    "name": "ActiveWell",
    "alias": "Active Well",
    "type": "esriFieldTypeInteger",
    "editable": True,
    "nullable": True,
    "defaultValue": None,
    "description": None,
    "domain": {
        "type" : "codedValue",
        "codedValues" : [
            {
                "name" : "Yes",
                "code" : 1
            },
            {
                "name" : "No",
                "code" : 0
            }
        ]
    }
}
questionable_data_choices_layer.manager.add_to_definition({"fields": [new_field]})

# Transfer values from temp field to new integer domain field
questionable_data_choices_layer.calculate(where="1=1", calc_expression={"field": "ActiveWell", "sqlExpression": "TempActiveWell"})

# Delete temp field
questionable_data_choices_layer.manager.delete_from_definition({"fields": [{"name": "TempActiveWell"}]})

And the desired result: Screenshot of integer field with domain storing boolean attributes -- good!

Explanation

The first few lines are just ArcGIS API for Python boilerplate. I instantiate the GIS object with the appropriate user for the workflow (either the item owner or a user with admin privileges, necessary to update the item’s definition) and get the relevant feature layer.

Only note: you should never store your password, even temporarily, in your script. I chose to use the getpass module from Python’s standard library to avoid this, but if you’re working in an ArcGIS Pro Python environment or ArcGIS Online notebook, you can use pro and home authentication schemes respectively (e.g. gis = GIS("pro")).

The rest of the script “converts” the string field to an integer field with domain by:

  1. Creating a intermediate integer field to store the converted values using the FeatureLayerManager’s .add_to_definition() method
  2. Converting the values using the FeatureLayer’s .calculate() method with as SQL CASE expression (think if-else statement for SQL, or Arcade’s IIF() function)
  3. Deleting the original, erroneous string field sing the FeatureLayerManager’s .delete_from_definition() method and creating an integer field with domain under the same name (since you cannot change the type of an existing feature layer field, we have to delete and recreate as the new type)
  4. Copying the calculated values over from the temp field created in Step 1. and finally deleting it to clean up the feature layer