본 글은 Refute 방법론을 사용하는 코드 레퍼런스를 남기는 목적으로 작성되었습니다. 그렇기에 Refute에 대한 학습이 필요하신 경우 DOWHY KEY CONCEPTS의 추정치를 검증하는 방법 을 참고하시길 바랍니다.
본 실험에서는 선형데이터(linear dataset)를 구축하였고, 선형 회귀를 estimator로 사용합니다.
Refute의 방법으로는 outcome을 대체하는 dummy outcome refuter 방법을 사용하여 진행합니다.
1. Insert Dependencies
from dowhy import CausalModelimport dowhy.datasetsimport pandas as pd import numpy as np # Config dict to set the logging levelimport logging.configDEFAULT_LOGGING ={'version':1,'disable_existing_loggers':False,'loggers':{'':{'level':'WARN',},}}logging.config.dictConfig(DEFAULT_LOGGING)
2. Create the dataset
Hyper parameter의 값을 바꾸며 effect의 변화를 확인할 수 있습니다.
Variable Name
Data Type
Interpretation
Zi
float
도구변수(Instrument Variable)
Wi
float
교란변수(Confounder)
V0
float
처치변수(Treatment)
Y
float
결과변수(Outcome)
# Value of the coefficient [BETA]BETA =10# Number of Common CausesNUM_COMMON_CAUSES =2# Number of InstrumentsNUM_INSTRUMENTS =1# Number of SamplesNUM_SAMPLES =100000# Treatment is BinaryTREATMENT_IS_BINARY =Falsedata = dowhy.datasets.linear_dataset(beta=BETA, num_common_causes=NUM_COMMON_CAUSES, num_instruments=NUM_INSTRUMENTS, num_samples=NUM_SAMPLES, treatment_is_binary=TREATMENT_IS_BINARY)data['df'].head()
Z0
W0
W1
V0
y
0
1.0
0.112689
-0.501474
8.076574
80.106461
1
0.0
0.645347
-0.072829
-0.219279
-0.092377
2
0.0
0.323480
0.989825
0.365947
6.900517
3
0.0
0.030437
1.334423
1.740524
20.319910
4
1.0
1.377841
0.628397
11.938058
125.523936
3. Creating the Causal Model
model =CausalModel( data = data['df'], treatment = data['treatment_name'], outcome = data['outcome_name'], graph = data['gml_graph'], instruments = data['instrument_names'])model.view_model()
아래의 그림은 처치변수(treatment), 결과변수(outcome), 교란변수(confounders), 도구변수(instrument variable)의 관계를 살펴볼 수 있습니다.
*** Causal Estimate ***
## Identified estimand
Estimand type: nonparametric-ate
### Estimand : 1
Estimand name: iv
Estimand expression:
Expectation(Derivative(y, [Z0])*Derivative([v0], [Z0])**(-1))
Estimand assumption 1, As-if-random: If U→→y then ¬(U →→{Z0})
Estimand assumption 2, Exclusion: If we remove {Z0}→{v0}, then ¬({Z0}→y)
## Realized estimand
Realized estimand: Wald Estimator
Realized estimand type: nonparametric-ate
Estimand expression:
-1
Expectation(Derivative(y, Z0))⋅Expectation(Derivative(v0, Z0))
Estimand assumption 1, As-if-random: If U→→y then ¬(U →→{Z0})
Estimand assumption 2, Exclusion: If we remove {Z0}→{v0}, then ¬({Z0}→y)
Estimand assumption 3, treatment_effect_homogeneity: Each unit's treatment ['v0'] is affected in the same way by common causes of ['v0'] and y
Estimand assumption 4, outcome_effect_homogeneity: Each unit's outcome y is affected in the same way by common causes of ['v0'] and y
Target units: ate
## Estimate
Mean value: 9.99706705820163