Exploratory Regression, Generate Network Spatial Weights Tools

Exploratory Regression

How do you use the Exploratory Regression tool in ArcToolbox?

Path to access the tool: Spatial Statistics Tools toolbox > Modeling Spatial Relationships toolset > Exploratory Regression tool

The Exploratory Regression tool evaluates all possible combinations of the input candidate explanatory variables, looking for OLS models that best explain the dependent variable within the context of user-specified criteria.

You can access the results of this tool (including the optional report file) from the Results window. If you disable background processing, results will also be written to the Progress dialog box.

1.    Input Features

The feature class or feature layer containing the dependent and candidate explanatory variables to analyze.

2.    Dependent Variable

The numeric field containing the observed values you want to model using OLS.

3.    Candidate Explanatory Variables

A list of fields to try as OLS model explanatory variables.

4.    Weights Matrix File (optional)

A file containing spatial weights that define the spatial relationships among your input features. This file is used to assess spatial autocorrelation among regression residuals. You can use the Generate Spatial Weights Matrix tool or the Generate Network Spatial Weights tool to create this file. When you do not provide a spatial weights matrix file, residuals are assessed for spatial autocorrelation based on each feature's eight nearest neighbors.

Note: The spatial weights matrix file is only used to analyze spatial structure in model residuals; it is not used to build or to calibrate any of the OLS models.
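To make the residual check more concrete, here is a minimal sketch of the Global Moran's I index computed from a list of residuals and a set of spatial weights. It is an illustration under stated assumptions, not the tool's implementation: the function name, the dict-of-dicts weights layout, and the sample data are all hypothetical, and the sketch returns only the index, not the p-value the tool reports.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values.

    weights is a hypothetical dict-of-dicts: weights[i][j] is the spatial
    weight between feature i and its neighbor j (indices into values).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    # S0: the sum of all spatial weights
    s0 = sum(w for row in weights.values() for w in row.values())
    # Weighted cross-products of deviations over all neighbor pairs
    cross = sum(w * z[i] * z[j]
                for i, row in weights.items()
                for j, w in row.items())
    denom = sum(d * d for d in z)
    return (n / s0) * (cross / denom)


# Four features along a line; each is a neighbor of the adjacent ones.
residuals = [1.0, 2.0, 3.0, 4.0]
chain_weights = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0},
}
i_stat = morans_i(residuals, chain_weights)
print(round(i_stat, 4))
```

A positive index, as in this example, suggests that similar residual values sit near each other (clustering); values near zero suggest no spatial pattern.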

5.    Output Report File (optional)

The report file contains tool results, including details about any models found that passed all the search criteria you entered. This output file also contains diagnostics to help you fix common regression problems in the case that you don't find any passing models.

6.    Output Results Table (optional)

An optional output table containing the explanatory variables and diagnostics for all of the models that meet the Coefficient p-value and VIF value cutoffs.

7.    Maximum Number of Explanatory Variables (optional)

All models with up to this number of explanatory variables will be assessed. If, for example, the Minimum Number of Explanatory Variables is 2 and the Maximum Number of Explanatory Variables is 3, the Exploratory Regression tool will try all models with every combination of two explanatory variables, and all models with every combination of three explanatory variables.

8.    Minimum Number of Explanatory Variables (optional)

This value represents the minimum number of explanatory variables for models evaluated. If, for example, the Minimum Number of Explanatory Variables is 2 and the Maximum Number of Explanatory Variables is 3, the Exploratory Regression tool will try all models with every combination of two explanatory variables, and all models with every combination of three explanatory variables.
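The way the minimum and maximum settings determine the set of models tried can be sketched with a short enumeration. This is only an illustration of the combinatorics described above; the function name and the field names are hypothetical.

```python
from itertools import combinations


def candidate_models(variables, min_vars, max_vars):
    """List every combination of variables with min_vars..max_vars members."""
    models = []
    for k in range(min_vars, max_vars + 1):
        models.extend(combinations(variables, k))
    return models


# Hypothetical candidate explanatory fields
candidates = ["POP", "INCOME", "DIST", "AGE"]
models = candidate_models(candidates, 2, 3)
print(len(models))  # 6 two-variable models + 4 three-variable models = 10
```

Because the number of combinations grows quickly with the number of candidate fields, a smaller maximum keeps the search manageable.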

9.    Minimum Acceptable Adj R Squared (optional)

This is the lowest Adjusted R-Squared value you consider a passing model. If a model passes all of your other search criteria, but has an Adjusted R-Squared value smaller than the value entered here, it will not show up as a Passing Model in the Output Report File. Valid values for this parameter range from 0.0 to 1.0. The default value is 0.5, indicating that passing models will explain at least 50 percent of the variation in the dependent variable.
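For reference, the standard textbook definition of Adjusted R-Squared penalizes R-Squared for the number of explanatory variables. The sketch below uses that general formula; the function name and the sample numbers are hypothetical, and the tool's internal computation is assumed, not confirmed, to match.

```python
def adjusted_r_squared(r_squared, n_obs, n_explanatory):
    """Standard adjusted R-squared for a model with an intercept."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_explanatory - 1)


# Hypothetical model: R-squared 0.6, 50 features, 3 explanatory variables
adj = adjusted_r_squared(0.6, 50, 3)
print(adj >= 0.5)  # compare against the minimum acceptable value
```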

10. Maximum Coefficient p value Cutoff (optional)

For each model evaluated, OLS computes explanatory variable coefficient p-values. The cutoff p-value you enter here represents the confidence level you require for all coefficients in the model in order to consider the model passing. Small p-values reflect a stronger confidence level. Valid values for this parameter range from 1.0 down to 0.0, but will most likely be 0.1, 0.05, 0.01, 0.001, and so on. The default value is 0.05, indicating passing models will only contain explanatory variables whose coefficients are statistically significant at the 95 percent confidence level (p-values smaller than 0.05). To relax this default you would enter a larger p-value cutoff, such as 0.1. If you are getting lots of passing models, you will likely want to make this search criterion more stringent by decreasing the default p-value cutoff from 0.05 to 0.01 or smaller.

11. Maximum VIF Value Cutoff (optional)

This value reflects how much redundancy (multicollinearity) among model explanatory variables you will tolerate. When the VIF (Variance Inflation Factor) value is higher than about 7.5, multicollinearity can make a model unstable; consequently, 7.5 is the default value here. If you want your passing models to have less redundancy, you would enter a smaller value, such as 5.0, for this parameter.
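As a rough guide to what a VIF cutoff means, the usual definition is VIF = 1 / (1 - R²), where R² comes from regressing one explanatory variable on all the others. The sketch below applies that general definition; the function names are hypothetical.

```python
def vif_from_r_squared(r_squared_j):
    """VIF for a variable whose regression on the other variables has this R-squared."""
    return 1.0 / (1.0 - r_squared_j)


def r_squared_at_vif(vif):
    """Inverse relation: the R-squared that corresponds to a given VIF."""
    return 1.0 - 1.0 / vif


print(round(vif_from_r_squared(0.8), 2))
print(round(r_squared_at_vif(7.5), 3))
```

Under this definition, a cutoff of 7.5 means a variable is flagged once roughly 87 percent of its variation can be explained by the other explanatory variables.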

12. Minimum Acceptable Jarque Bera p value (optional)

The p-value returned by the Jarque-Bera diagnostic test indicates whether the model residuals are normally distributed. If the p-value is statistically significant (small), the model residuals are not normal and the model is biased. Passing models should have large Jarque-Bera p-values. The default minimum acceptable p-value is 0.1. Only models returning p-values larger than this minimum will be considered passing. If you are having trouble finding unbiased passing models, and decide to relax this criterion, you might enter a smaller minimum p-value such as 0.05.
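The sketch below shows the textbook form of the Jarque-Bera statistic, computed from the skewness and kurtosis of the residuals, with a p-value from the asymptotic chi-square distribution with 2 degrees of freedom. It is an illustration only: the function name and sample residuals are hypothetical, and the tool may compute its p-value differently.

```python
import math


def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p-value) for a list of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments of the residuals
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square distribution with 2 degrees of freedom
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


jb, p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(p > 0.1)  # compare against the minimum acceptable p-value
```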

13. Minimum Acceptable Spatial Autocorrelation p value (optional)

For models that pass all of the other search criteria, the Exploratory Regression tool will check model residuals for spatial clustering using Global Moran's I. When the p-value for this diagnostic test is statistically significant (small), it indicates the model is very likely missing key explanatory variables (it isn't telling the whole story). Unfortunately, if you have spatial autocorrelation in your regression residuals, your model is misspecified, so you cannot trust your results. Passing models should have large p-values for this diagnostic test. The default minimum p-value is 0.1. Only models returning p-values larger than this minimum will be considered passing. If you are having trouble finding properly specified models because of this diagnostic test, and decide to relax this search criterion, you might enter a smaller minimum such as 0.05.
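Taken together, the five search criteria act as a filter on each candidate model. The sketch below shows one way to express that filter using the default thresholds described above. The dictionary keys, the function name, and the exact handling of values that fall exactly on a threshold are assumptions for illustration, not the tool's actual logic.

```python
def is_passing(model, min_adj_r2=0.5, max_coef_p=0.05, max_vif=7.5,
               min_jb_p=0.1, min_sa_p=0.1):
    """Check a model's diagnostics (hypothetical dict layout) against the criteria."""
    return (
        model["adj_r2"] >= min_adj_r2                  # explains enough variation
        and max(model["coef_p_values"]) < max_coef_p   # every coefficient significant
        and model["max_vif"] <= max_vif                # limited multicollinearity
        and model["jb_p"] > min_jb_p                   # residuals look normal
        and model["sa_p"] > min_sa_p                   # residuals not spatially clustered
    )


good = {"adj_r2": 0.62, "coef_p_values": [0.01, 0.03], "max_vif": 2.1,
        "jb_p": 0.45, "sa_p": 0.30}
redundant = dict(good, max_vif=9.0)  # same model, but too much redundancy

print(is_passing(good), is_passing(redundant))
```

Relaxing a criterion corresponds to changing one keyword argument, for example `min_jb_p=0.05`.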

Generate Network Spatial Weights

How do you use the Generate Network Spatial Weights tool in ArcToolbox?

Path to access the tool: Spatial Statistics Tools toolbox > Modeling Spatial Relationships toolset > Generate Network Spatial Weights tool

This tool constructs a spatial weights matrix file (.swm) using a network dataset, defining feature spatial relationships in terms of the underlying network structure.

1.    Input Feature Class

The point feature class for which network spatial relationships among features will be assessed.

2.    Unique ID Field

An integer field containing a different value for every feature in the input feature class. If you don't have a Unique ID field, you can create one by adding an integer field to your feature class table and calculating the field values to equal the FID or OBJECTID field.

3.    Output Spatial Weights Matrix File

The output network spatial weights matrix (.swm) file.

4.    Input Network

The network dataset for which spatial relationships among features in the input feature class will be defined. Network datasets most often represent street networks but may represent other kinds of transportation networks as well. The network dataset needs at least one time-based and one distance-based cost attribute.

5.    Travel Mode (optional)

The mode of transportation for the analysis. Custom is always a choice. For other travel modes to appear, they must be present in the network dataset specified in the Input Network parameter.

A travel mode is defined on a network dataset and provides override values for parameters that model car, truck, pedestrian, or other modes of travel.

6.    Impedance Attribute

The type of cost units to use as impedance in the analysis.

7.    U-turn Policy (optional)

Specifies optional U-turn restrictions.

  1. ALLOW_UTURNS—U-turns will be possible anywhere. This is the default
  2. NO_UTURNS—No U-turns will be allowed during navigation
  3. ALLOW_DEAD_ENDS_ONLY—U-turns will be possible only at dead ends (that is, single-valent junctions)
  4. ALLOW_DEAD_ENDS_AND_INTERSECTIONS_ONLY—U-turns will be possible only at dead ends and intersections

8.    Restrictions (optional)

A list of restrictions. Check the restrictions to be honored in spatial relationship computations.

9.    Use Hierarchy in Analysis (optional)

Specifies whether or not to use a hierarchy in the analysis.

  1. Checked—Will use the network dataset's hierarchy attribute in a heuristic path algorithm to speed analysis.
  2. Unchecked—Will use an exact path algorithm instead. If there is no hierarchy attribute, this option does not affect analysis.

10. Impedance Cutoff (optional)

Specifies a cutoff value for INVERSE and FIXED conceptualizations of spatial relationships. Enter this value using the units specified by the Impedance Attribute parameter.

A value of zero indicates that no threshold is applied. When this parameter is left blank, a default threshold value is computed based on input feature class extent and the number of features.

11. Maximum Number of Neighbors (optional)

An integer reflecting the maximum number of neighbors to find for each feature.

12. Barriers (optional)

The name of a point feature class with features representing blocked intersections, road closures, accident sites, or other locations where travel is blocked along the network.

13. Search Tolerance (optional)

The search threshold used to locate features from the Input Feature Class on the network dataset. This parameter includes a search value and the units for the tolerance.

14. Time of Day (optional)

Specifies whether travel times should consider traffic conditions. Especially in urbanized areas, traffic conditions can significantly impact the area covered within a specified travel time. If no date or time is specified, the distance covered during a specified travel time will not be impacted by traffic.

15. Conceptualization of Spatial Relationships (optional)

Specifies how the weight associated with each spatial relationship is determined.

  1. INVERSE—Features farther away have a smaller weight than features nearby.
  2. FIXED—Features within the Impedance Cutoff are neighbors (weight of 1); features outside the Impedance Cutoff are not weighted (weight of 0).

16. Exponent (optional)

Parameter for the INVERSE Conceptualization of Spatial Relationships calculation. Typical values are 1 or 2. Weights drop off more quickly with distance as this exponent value increases.
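The two conceptualizations and the effect of the exponent can be sketched as simple weight functions of network impedance. This is a simplified illustration: the function names are hypothetical, impedance is assumed to be greater than zero, and the tool's treatment of edge cases (such as coincident features) is not covered.

```python
def inverse_weight(impedance, exponent=1.0):
    """INVERSE: weight shrinks as impedance grows (assumes impedance > 0)."""
    return 1.0 / impedance ** exponent


def fixed_weight(impedance, cutoff):
    """FIXED: 1 inside the Impedance Cutoff, 0 outside it."""
    return 1.0 if impedance <= cutoff else 0.0


# A feature 2 impedance units away, then features on either side of a cutoff of 5
print(inverse_weight(2.0, exponent=1))
print(inverse_weight(2.0, exponent=2))
print(fixed_weight(3.0, cutoff=5.0), fixed_weight(6.0, cutoff=5.0))
```

Raising the exponent from 1 to 2 halves the weight of the feature 2 units away in this example, which is the faster drop-off described above.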

17. Row Standardization (optional)

Row standardization is recommended whenever feature distribution is potentially biased due to sampling design or to an imposed aggregation scheme.

  1. Checked—Spatial weights are standardized by row. Each weight is divided by its row sum.
  2. Unchecked—No standardization of spatial weights is applied.
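Row standardization as described above can be sketched in a few lines. The dict-of-dicts layout and the function name are hypothetical, and giving features with no neighbors a row of zeros is an assumption made for the example.

```python
def row_standardize(weights):
    """Divide each weight by the sum of the weights in its row."""
    standardized = {}
    for i, row in weights.items():
        total = sum(row.values())
        standardized[i] = {
            j: (w / total if total else 0.0) for j, w in row.items()
        }
    return standardized


# Hypothetical weights: feature 0 has two neighbors, feature 1 has one
raw = {0: {1: 2.0, 2: 6.0}, 1: {0: 2.0}}
std = row_standardize(raw)
print(std[0])  # each row now sums to 1
```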
