# 사용자 정의 결과 함수를 사용해 추정치 반박하기

* 작성자: 경윤영
* 원문: [Dowhy -Simple Example on Creating a Custom Refutation Using User-Defined Outcome Functions](https://microsoft.github.io/dowhy/example_notebooks/dowhy_demo_dummy_outcome_refuter.html)

본 글은 Refute 방법론을 사용하는 코드 레퍼런스를 남기는 목적으로 작성되었습니다. 그렇기에 Refute에 대한 학습이 필요하신 경우 DOWHY KEY CONCEPTS의 [추정치를 검증하는 방법](https://playinpap.gitbook.io/dowhy/dowhy-key-concepts/sensitivity-analysis) 을 참고하시길 바랍니다.

본 실험에서는 선형데이터(linear dataset)를 구축하였고, 선형 회귀를 estimator로 사용합니다.

Refute의 방법으로는 outcome을 대체하는 dummy outcome refuter 방법을 사용하여 진행합니다.

## 1. Insert Dependencies

```python
from dowhy import CausalModel
import dowhy.datasets
import pandas as pd 
import numpy as np 
# Config dict to set the logging level
import logging.config
DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers':False,
    'loggers': {
        '': {
            'level': 'WARN',
        },
    }
}

logging.config.dictConfig(DEFAULT_LOGGING)
```

## 2. Create the dataset

Hyper parameter의 값을 바꾸며 effect의 변화를 확인할 수 있습니다.

| Variable Name | Data Type | Interpretation            |
| ------------- | --------- | ------------------------- |
| Zi            | float     | 도구변수(Instrument Variable) |
| Wi            | float     | 교란변수(Confounder)          |
| V0            | float     | 처치변수(Treatment)           |
| Y             | float     | 결과변수(Outcome)             |

```python
# Value of the coefficient [BETA]
BETA = 10
# Number of Common Causes
NUM_COMMON_CAUSES = 2
# Number of Instruments
NUM_INSTRUMENTS = 1
# Number of Samples
NUM_SAMPLES = 100000
# Treatment is Binary
TREATMENT_IS_BINARY =False
data = dowhy.datasets.linear_dataset(beta=BETA,
                                 num_common_causes=NUM_COMMON_CAUSES,
                                 num_instruments=NUM_INSTRUMENTS,
                                 num_samples=NUM_SAMPLES,
                                 treatment_is_binary=TREATMENT_IS_BINARY)
data['df'].head()
```

|   | Z0  | W0       | W1        | V0        | y          |
| - | --- | -------- | --------- | --------- | ---------- |
| 0 | 1.0 | 0.112689 | -0.501474 | 8.076574  | 80.106461  |
| 1 | 0.0 | 0.645347 | -0.072829 | -0.219279 | -0.092377  |
| 2 | 0.0 | 0.323480 | 0.989825  | 0.365947  | 6.900517   |
| 3 | 0.0 | 0.030437 | 1.334423  | 1.740524  | 20.319910  |
| 4 | 1.0 | 1.377841 | 0.628397  | 11.938058 | 125.523936 |

## 3. Creating the Causal Model

```python
model = CausalModel(
    data = data['df'],
    treatment = data['treatment_name'],
    outcome = data['outcome_name'],
    graph = data['gml_graph'],
    instruments = data['instrument_names']
)

model.view_model()
```

아래의 그림은 처치변수(treatment), 결과변수(outcome), 교란변수(confounders), 도구변수(instrument variable)의 관계를 살펴볼 수 있습니다.

W0, W1 : 교란변수(confounders)

Z0 : 도구변수(instrument variable)

V0: 처치변수(treatment)

Y: 결과변수(outcome)

![변수들의 관계](https://user-images.githubusercontent.com/39981604/153433647-39b2fd58-d7f1-485c-9f5e-39e1aa8e8899.png)

## 4. Identify the Estimand

```python
identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)
print(identified_estimand)
```

```
Estimand type: nonparametric-ate

### Estimand : 1
Estimand name: backdoor
Estimand expression:
  d
─────(Expectation(y|W1,W0))
d[v₀]
Estimand assumption 1, Unconfoundedness: If U→{v0} and U→y then P(y|v0,W1,W0,U) = P(y|v0,W1,W0)

### Estimand : 2
Estimand name: iv
Estimand expression:
Expectation(Derivative(y, [Z0])*Derivative([v0], [Z0])**(-1))
Estimand assumption 1, As-if-random: If U→→y then ¬(U →→{Z0})
Estimand assumption 2, Exclusion: If we remove {Z0}→{v0}, then ¬({Z0}→y)

### Estimand : 3
Estimand name: frontdoor
No such variable found!
```

## 5. Estimating the Effect

```python
causal_estimate = model.estimate_effect( identified_estimand,
                                       method_name="iv.instrumental_variable",
                                       method_params={'iv_instrument_name':'Z0'}
                                       )
print(causal_estimate)
```

```
*** Causal Estimate ***

## Identified estimand
Estimand type: nonparametric-ate

### Estimand : 1
Estimand name: iv
Estimand expression:
Expectation(Derivative(y, [Z0])*Derivative([v0], [Z0])**(-1))
Estimand assumption 1, As-if-random: If U→→y then ¬(U →→{Z0})
Estimand assumption 2, Exclusion: If we remove {Z0}→{v0}, then ¬({Z0}→y)

## Realized estimand
Realized estimand: Wald Estimator
Realized estimand type: nonparametric-ate
Estimand expression:
                                                              -1
Expectation(Derivative(y, Z0))⋅Expectation(Derivative(v0, Z0))
Estimand assumption 1, As-if-random: If U→→y then ¬(U →→{Z0})
Estimand assumption 2, Exclusion: If we remove {Z0}→{v0}, then ¬({Z0}→y)
Estimand assumption 3, treatment_effect_homogeneity: Each unit's treatment ['v0'] is affected in the same way by common causes of ['v0'] and y
Estimand assumption 4, outcome_effect_homogeneity: Each unit's outcome y is affected in the same way by common causes of ['v0'] and y

Target units: ate

## Estimate
Mean value: 9.99706705820163
```

## 6. Refuting the Estimate

### 1) Using a Randomly Generated Outcome

```python
ref = model.refute_estimate(identified_estimand,
                           causal_estimate,
                           method_name="dummy_outcome_refuter"
                           )
print(ref[0])
```

```
Refute: Use a Dummy Outcome
Estimated effect:0
New effect:4.657654751543888e-05
p value:0.49
```

### 2) Using a Function that Generates the Outcome from the Confounders

아래와 같이 새로운 Y를 생성하기 위해 교란변수를 사용한 선형식을 정의합니다.

$$y\_{new} = \beta\_0W\_0 + \beta\_1W\_1 + \gamma\_0$$

where, $$\beta\_0 = 1, \beta\_1 = 2, \gamma\_0 = 3$$

```python
coefficients = np.array([1,2])
bias = 3
def linear_gen(df):
    y_new = np.dot(df[['W0','W1']].values,coefficients) + 3
return y_new
```

```python
ref = model.refute_estimate(identified_estimand,
                           causal_estimate,
                           method_name="dummy_outcome_refuter",
                           outcome_function=linear_gen
                           )

print(ref[0])
```

```
Refute: Use a Dummy Outcome
Estimated effect:0
New effect:-1.1692081553648758e-05
p value:0.47
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://playinpap.gitbook.io/dowhy/example-notebooks/custom-refutation.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
