The Graphical Law of Large Numbers and Its Applications in Decision-Making Problems

Vladimir Ivanovich Norkin, Roger J.-B. Wets
V.M. Glushkov Institute of Cybernetics; University of California, Davis
Kyiv, 11 June 2013
Contents
Introduction
Some relations between the convergence of functions and of minima
Graphical convergence and its consequences
Decision-making problems: the language of inclusions
Stochastic inclusions
Laws of large numbers
The graphical law of large numbers
Discussion of the graphical law of large numbers
Stochastic programming problems
Introduction
How are statistical learning theory and stochastic programming theory related?
(They have much in common but do not interact.)
How are decision-making problems (optimization) related to the Law of Large Numbers (LLN) (probability theory)?
(The LLN is used to approximate functions that are mathematical expectations.)
What does the GRAPHICAL Law of Large Numbers mean?
(It generalizes the uniform LLN to set-valued functions.)
What does the graphical law of large numbers add?
(It makes it possible to approximate problems with discontinuities.)
Some relations between the convergence of functions and of minima
Key theorem of (statistical) learning theory I:
(a) Consistency in the sense of Vapnik $\Longrightarrow$ (b) Uniform one-sided convergence
(a) $\forall c:\ \inf_{\{x:\,R(x)\ge c\}} R_n(x) \to \inf_{\{x:\,R(x)\ge c\}} R(x) \ge c$
(b) $\lim_n \sup_x\,(R(x) - R_n(x)) \le 0$
(c) $\lim_n R_n(x) = R(x)$ (pointwise)
Key theorem of (statistical) learning theory II:
(a) Consistency in the sense of Vapnik $\Longleftarrow$ (b) Uniform one-sided convergence + (c) pointwise convergence
$\Downarrow$
An important theorem of variational analysis, Rockafellar and Wets (1998):
(d) Convergence of infimums $\Longleftarrow$ (e) Epi-graphical convergence
(d) $\inf_{x\in X} R_n(x) \to \inf_{x\in X} R(x)$
(e) $\lim_n\,(R(x) - R_n(x_n)) \le 0$ for all $x_n \to x$, and $\lim_n\,(R(x) - R_n(x_n)) = 0$ for some $x_n \to x$.
Graphical convergence and its consequences, Rockafellar and Wets (1998)
Graphical convergence of functions and mappings, $F_n \xrightarrow{g} F$, is the convergence of their graphs, $\mathrm{gph}\,F_n \to \mathrm{gph}\,F$ (as sets).
$F_n \xrightarrow{g} F$ means:
(a) for every $x_n \to x \in X$: if $F_{n_k}(x_{n_k}) \ni y_{n_k} \to y$, then $y \in F(x)$;
(b) for every $x \in X$ and $y \in F(x)$ there exist $x_n \to x$ and $y_n \in F_n(x_n)$ such that $\lim_n y_n = y$.
Theorem. Let the function $F : X \to \mathbb{R}$ be lower semicontinuous, and let the mappings $G_n$ converge to $G$ both pointwise and graphically, $G_n \xrightarrow{g,p} G$. Then the minima converge as well:
$$\inf_{\{x\in X:\ \vec{0}\in G_n(x)\}} F(x) \;\to\; \inf_{\{x\in X:\ \vec{0}\in G(x)\}} F(x).$$
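A small illustration of graphical convergence, chosen here and not taken from the slides: the ramps $F_n(x) = \max(-1, \min(1, nx))$ converge pointwise to the sign function (with value 0 at $x = 0$), while their graphs converge to the graph of the set-valued map equal to $[-1, 1]$ at $x = 0$. The sketch below checks condition (b) for the point $(0, 0.5)$ of the limit graph; all function names are hypothetical.

```python
def f_n(n, x):
    """Single-valued approximation: the ramp clip(n * x, -1, 1)."""
    return max(-1.0, min(1.0, n * x))


def recovery_point(n, y):
    """A point x_n with f_n(x_n) = y for y in [-1, 1]; x_n -> 0 as n grows."""
    return y / n


def distance_to_graph(n, x0, y0, grid=20001, width=2.0):
    """Approximate distance from (x0, y0) to gph f_n, sampled on [-width, width]."""
    best = float("inf")
    for k in range(grid):
        x = -width + 2.0 * width * k / (grid - 1)
        best = min(best, ((x - x0) ** 2 + (f_n(n, x) - y0) ** 2) ** 0.5)
    return best


# (0, 0.5) belongs to the graph of the set-valued limit, because 0.5 is in [-1, 1].
target = (0.0, 0.5)
for n in (1, 10, 100):
    x_n = recovery_point(n, target[1])
    print(n, x_n, f_n(n, x_n), distance_to_graph(n, *target))
```

The pointwise values $F_n(0) = 0$ never approach 0.5, yet the distance from $(0, 0.5)$ to the graphs shrinks, which is exactly what convergence of graphs as sets records.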
Decision-making problems: the language of inclusions
Inclusions (generalized equations): find $x \in X$ such that $\vec{0} \in S(x)$, where $S$ is a set-valued mapping.
Inclusions generalize (systems of) equations: $\vec{0} = S(x)$, where $S$ is a single-valued mapping (a vector function).
Inclusions generalize (systems of) inequalities:
$$\{x \in X : \vec{f}(x) \le \vec{0}\} \iff \{x \in X : \vec{0} \in \underbrace{\vec{f}(x) + \mathbb{R}^m_+}_{S(x)}\}.$$
Optimization problems (with inclusions):
$F(x^*) = \min_{x\in X} F(x)$,
$F(x^*) = \min_{\{x\in X:\ \vec{0}\in S(x)\}} F(x)$.
Necessary optimality conditions in the form of inclusions:
$\vec{0} \in \partial F(x)$, where $\partial F$ is the subdifferential;
$\vec{0} \in \partial F(x) + N_X(x)$, where $N_X(x)$ is the normal cone.
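To make the optimality inclusion concrete, here is a minimal one-dimensional sketch under assumptions that are not in the slides: $F(x) = |x|$ and $X = [a, b]$ with $a < b$. Sets are represented as closed intervals `(lo, hi)`, and the check is $0 \in \partial F(x) + N_X(x)$. The helper names are hypothetical.

```python
import math


def subdiff_abs(x):
    """Subdifferential of F(x) = |x| as a closed interval (lo, hi)."""
    if x > 0:
        return (1.0, 1.0)
    if x < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)  # at the kink the subdifferential is the whole segment


def normal_cone(x, a, b):
    """Normal cone to X = [a, b] (a < b) at a point x of X, as a closed interval."""
    if x == a:
        return (-math.inf, 0.0)
    if x == b:
        return (0.0, math.inf)
    return (0.0, 0.0)


def minkowski_sum(p, q):
    """Sum of two closed intervals."""
    return (p[0] + q[0], p[1] + q[1])


def is_stationary(x, a, b):
    """Check the inclusion 0 in dF(x) + N_X(x)."""
    lo, hi = minkowski_sum(subdiff_abs(x), normal_cone(x, a, b))
    return lo <= 0.0 <= hi


print(is_stationary(0.0, -1.0, 1.0))  # interior kink
print(is_stationary(1.0, 1.0, 2.0))   # left endpoint of [1, 2]
print(is_stationary(2.0, 1.0, 2.0))   # right endpoint of [1, 2]
```

On $X = [-1, 1]$ the inclusion holds at the kink $x = 0$ even though $F$ has no derivative there; on $X = [1, 2]$ it holds only at the left endpoint, where the normal cone compensates for the slope.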
Stochastic inclusions
Inclusions with a parameter $y \in Y$:
$X(y) = \{x \in X : 0 \in S(x, y)\}$
Stochastic inclusions ($y$ is a random variable):
$X^* = \{x \in X : 0 \in \mathbb{E}S(x) := \mathbb{E}_y S(x, y)\}$,
which is not the same as $\{x \in X : 0 \in S(x, \mathbb{E}_\xi y)\}$.
Monte Carlo approximation:
$X^n = \{x \in X : 0 \in S^n(x) := \frac{1}{n}\sum_{i=1}^n S(x, y_i)\}$.
Question: do the solution sets converge, $X^n \to X^*$?
Answer: convergence of the solutions follows from graphical convergence of the approximations, $S^n \xrightarrow{g} \mathbb{E}S$, i.e. from convergence of the graphs $\mathrm{gph}\,S^n \to \mathrm{gph}\,\mathbb{E}S$ (which is not the same as pointwise convergence $S^n(x) \to \mathbb{E}S(x)$).
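The following sketch illustrates the Monte Carlo approximation on a hypothetical example that is not from the slides: $S(x, y) = \partial_x |x - y|$, the set-valued sign of $x - y$, with $y$ exponentially distributed with rate 1. Under these assumptions $X^*$ is the median $\ln 2$, the set $\{x : 0 \in S(x, \mathbb{E}y)\}$ is the mean $\{1\}$, and $X^n$ is the set of sample medians.

```python
import math
import random


def sgn_interval(t):
    """Set-valued sign of t as a closed interval (lo, hi)."""
    if t > 0:
        return (1.0, 1.0)
    if t < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)


def sample_average_map(x, ys):
    """S^n(x) = (1/n) * sum_i S(x, y_i): a Minkowski average of intervals."""
    lo = sum(sgn_interval(x - y)[0] for y in ys) / len(ys)
    hi = sum(sgn_interval(x - y)[1] for y in ys) / len(ys)
    return (lo, hi)


def solution_set(ys):
    """X^n = {x : 0 in S^n(x)}, the interval of sample medians."""
    s = sorted(ys)
    n = len(s)
    return (s[(n - 1) // 2], s[n // 2])


rng = random.Random(0)
ys = [rng.expovariate(1.0) for _ in range(20001)]

print("X^n          :", solution_set(ys))
print("X* (median)  :", math.log(2.0))
print("S^n at mean 1:", sample_average_map(1.0, ys))  # interval that should exclude 0
```

The printed solution set should lie near $\ln 2 \approx 0.693$, while the interval $S^n(1)$ stays away from zero, so substituting the mean of $y$ into the inclusion gives a different answer.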
Laws of large numbers (LLN)
LLN for independent random variables (Kolmogorov):
$F^n := \frac{1}{n}\sum_{i=1}^n f(y_i) \to \mathbb{E}_y f(y)$ (with probability 1).
Uniform LLN (Glivenko-Cantelli):
$\sup_{x\in X} \left| \frac{1}{n}\sum_{i=1}^n f(x, y_i) - \mathbb{E}_y f(x, y) \right| \to 0$ with probability 1
(not the same as the pointwise law of large numbers: $\frac{1}{n}\sum_{i=1}^n f(\bar x, y_i) \to \mathbb{E}_y f(\bar x, y)$ with probability 1 for each $\bar x$).
Uniform one-sided law of large numbers of Vapnik-Chervonenkis
Theorems on the rate of concentration in the LLN
LLN for random sets (Artstein and Vitale (1975), Artstein and Hart (1981))
Epi-graphical LLN for functions (Wets et al. (1988-1996))
Uniform LLN for random mappings (Molchanov (1999), Teran (2008)), pseudo-uniform LLN (Shapiro and Xu (2007))
Graphical law of large numbers (for random mappings, Norkin and Wets (2013))
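The difference between the pointwise and the uniform LLN can be checked numerically in the classical Glivenko-Cantelli setting. The sketch assumes, purely for illustration, $f(x, y) = \mathbf{1}\{y \le x\}$ with $y$ uniform on $[0, 1]$, so that $\mathbb{E}_y f(x, y) = x$ on $X = [0, 1]$; the function names are hypothetical.

```python
import random


def empirical_mean(x, ys):
    """(1/n) * sum_i 1{y_i <= x}: the empirical distribution function at x."""
    return sum(1.0 for y in ys if y <= x) / len(ys)


def sup_deviation(ys):
    """sup over x in [0, 1] of |empirical_mean(x) - x|, computed exactly.

    The supremum is attained at, or just before, a jump of the empirical
    distribution function, so it is enough to inspect the sorted sample.
    """
    s = sorted(ys)
    n = len(s)
    return max(max((i + 1) / n - y, y - i / n) for i, y in enumerate(s))


rng = random.Random(0)
for n in (100, 1000, 10000):
    ys = [rng.random() for _ in range(n)]
    pointwise = abs(empirical_mean(0.5, ys) - 0.5)  # deviation at one fixed point
    print(n, pointwise, sup_deviation(ys))          # uniform deviation dominates it
```

The uniform deviation is never smaller than the deviation at any fixed point, and the uniform LLN is the statement that it, too, tends to zero.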
The graphical law of large numbers for set-valued mappings
Theorem (Norkin and Wets (2013)). Let the random variables $\{\xi, \xi_1, \xi_2, \dots\}$ and the mapping $G : X \times \mathbb{R}^l \to 2^{\mathbb{R}^m}$ satisfy the conditions:
(a) $\{\xi, \xi_i\}$ are independent and identically distributed;
(b) $G$ is upper semicontinuous in $x$;
(c) $G$ is measurable in $y$ (its second argument);
(d) $\sup_{g\in G(x,\xi)} \|g\| \le K(\xi)$, where $K$ is integrable.
Then, with probability one,
$$\mathrm{gph}\underbrace{\Big\{\frac{1}{n}\sum_{i=1}^n G(\cdot,\xi_i)\Big\}}_{\text{sum by elements}} \;\to\; \mathrm{gph}\underbrace{\Big\{\mathbb{E}_\xi\,\mathrm{conv}\,\{G(\cdot,\xi)\}\Big\}}_{\text{integrals of all selections}}.$$
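Why the convex hull appears in the limit can be seen on a toy example at a fixed $x$. The two-point map $G(x, \xi) = \{x + \xi - 1,\ x + \xi + 1\}$ and the distribution of $\xi$ below are assumptions made only for this sketch; exact rational arithmetic is used so that coinciding sums are merged correctly.

```python
from fractions import Fraction
import random


def G(x, xi):
    """Hypothetical nonconvex two-point map G(x, xi) = {x + xi - 1, x + xi + 1}."""
    return {x + xi - 1, x + xi + 1}


def minkowski_average(sets):
    """(1/n) * (A_1 + ... + A_n), with the sum taken element by element."""
    sums = {Fraction(0)}
    for a_set in sets:
        sums = {s + a for s in sums for a in a_set}
    return sorted(s / len(sets) for s in sums)


def largest_gap(points):
    """Largest distance between neighbouring points of a sorted list."""
    return max(b - a for a, b in zip(points, points[1:]))


rng = random.Random(0)
x = Fraction(0)
for n in (2, 10, 50):
    # xi takes rational values in [-1, 1] with mean zero.
    xis = [Fraction(rng.randint(-1000, 1000), 1000) for _ in range(n)]
    avg = minkowski_average([G(x, xi) for xi in xis])
    print(n, len(avg), float(avg[0]), float(avg[-1]), float(largest_gap(avg)))
```

Each average is a finite set of $n + 1$ points with spacing $2/n$, so it fills out an interval of length 2 around $x$ plus the sample mean of $\xi$. Under the stated assumptions this is consistent with the limit $\mathbb{E}_\xi\,\mathrm{conv}\,G(x, \xi) = [x - 1, x + 1]$, although every single $G(x, \xi)$ is a nonconvex pair of points.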
Discussion of the graphical law of large numbers
Pointwise LLN (Artstein and Vitale (1975), Artstein and Hart (1981)):
$\mathrm{gph}\big\{\frac{1}{n}\sum_{i=1}^n G(\bar x,\xi_i)\big\} \to \mathrm{gph}\{\mathbb{E}_\xi\,\mathrm{conv}\,\{G(\bar x,\xi)\}\}$.
Vapnik, Chervonenkis (1971, 1981): necessary and sufficient conditions for the uniform LLN:
$\mathrm{Entropy}_X(n)/n \to 0$ (for indicator functions);
$\mathrm{EpsilonEntropy}_X^{\varepsilon}(n)/n \to 0$, with $\sup_{x\in X}|f(x,\xi)| \le K(\xi)$ and $K$ integrable.
Jennrich (1969): uniform law of large numbers (sufficient conditions):
$f(\cdot,\xi)$ is continuous on a compact set $X$, and $\sup_{x\in X}|f(x,\xi)| \le K(\xi)$ with $K$ integrable.
The stochastic optimization problem
Optimization problem:
$F(x^*) = \min_{x\in X} F(x)$
Optimization problem with a (random) parameter $y$:
$F(x^*(y), y) = \min_{x\in X} F(x, y)$
Stochastic programming problem (minimization of expected risk):
$F(x^*) = \min_{x\in X}\,[F(x) = \mathbb{E}_y F(x, y)]$
This is not the same as
$F(x^{**}, \mathbb{E}y) = \min_{x\in X} F(x, \mathbb{E}y)$
The average damage (risk) $\mathbb{E}_y F(x, y)$ over all possible floods $y$ is not the same as the damage from a single average flood $\mathbb{E}y$.
So how do we solve it? The difficulty is computing the expectation $\mathbb{E}_y F(x, y)$.
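The flood remark can be turned into a tiny worked example. All numbers are hypothetical: a dike of height $x$ costs $x$ to build, a flood of level $y$ causes damage $3\max(y - x, 0)$, and $y$ equals 0 or 10 with probability 1/2 each, so $\mathbb{E}y = 5$.

```python
def cost(x, y, build=1.0, damage=3.0):
    """F(x, y): building cost plus damage from the part of the flood above the dike."""
    return build * x + damage * max(y - x, 0.0)


# Two equally likely flood levels (illustrative values only).
scenarios = [(0.0, 0.5), (10.0, 0.5)]


def expected_cost(x):
    """E_y F(x, y) for the two-scenario distribution."""
    return sum(p * cost(x, y) for y, p in scenarios)


mean_flood = sum(p * y for y, p in scenarios)           # E y
grid = [0.5 * k for k in range(21)]                      # candidate heights 0, 0.5, ..., 10

x_star = min(grid, key=expected_cost)                    # minimizes E_y F(x, y)
x_mean = min(grid, key=lambda x: cost(x, mean_flood))    # minimizes F(x, E y)

print("x*  =", x_star, "expected cost:", expected_cost(x_star))
print("x** =", x_mean, "expected cost:", expected_cost(x_mean))
```

Designing for the average flood gives a height of 5 with expected cost 12.5, while minimizing the expected cost gives a height of 10 with expected cost 10.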
Empirical approximation of stochastic programming problems
We approximate the theoretical mean (risk) $\mathbb{E}_y F(x, y)$ by the empirical one:
$F^n(x) = \frac{1}{n}\sum_{i=1}^n F(x, y_i)$.
This leads to the approximate problem
$F^n(x^n) = \min_{x\in X} F^n(x) = \min_{x\in X} \frac{1}{n}\sum_{i=1}^n F(x, y_i)$.
Convergence: if $F^n(x) \Rightarrow F(x)$ uniformly on $X$, then the solutions converge, $x^n \to X^*$.
The uniform (strong) law of large numbers,
$\sup_{x\in X} |F^n(x) - F(x)| \to 0$ (with probability one),
is not the same as the pointwise law of large numbers:
$F^n(\bar x) \to F(\bar x)$ with probability 1 for every $\bar x \in X$.
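A minimal sketch of the empirical approximation, under assumptions chosen only for illustration: $F(x, y) = (x - y)^2$, $y$ uniform on $[0, 2]$, and $X = [0, 2]$ discretized with step 0.01. Here $\mathbb{E}_y F(x, y) = (x - 1)^2 + 1/3$, so $X^* = \{1\}$.

```python
import random


def F(x, y):
    """Hypothetical loss: squared deviation of the decision x from the outcome y."""
    return (x - y) ** 2


def empirical_risk(x, ys):
    """F^n(x) = (1/n) * sum_i F(x, y_i)."""
    return sum(F(x, y) for y in ys) / len(ys)


def saa_minimizer(ys, grid):
    """x^n: a minimizer of the empirical risk over a finite grid of X."""
    return min(grid, key=lambda x: empirical_risk(x, ys))


rng = random.Random(0)
grid = [k / 100 for k in range(201)]  # X = [0, 2] with step 0.01

for n in (10, 100, 1000):
    ys = [rng.uniform(0.0, 2.0) for _ in range(n)]
    x_n = saa_minimizer(ys, grid)
    print(n, x_n, empirical_risk(x_n, ys))
```

As $n$ grows, the grid minimizer $x^n$ should settle near 1 and the optimal empirical risk near $1/3$.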
Generalizations
Uniform convergence $F^n(x) \Rightarrow F(x)$ is not necessary for the convergence of minima, $x^n \to X^*$; the (weaker) convergence of epigraphs is sufficient:
$\text{epi-gph}\,F^n \to \text{epi-gph}\,F$.
Graph (gph) and epigraph (epi-gph) of a function:
$\mathrm{gph}\,F = \{(x, F(x)) : x \in X\}$,
$\text{epi-gph}\,F = \{(x, z) : x \in X,\ z \ge F(x)\}$.
[Figure: graphical illustration of the convergence of epigraphs for discontinuous functions]
Epi-graphical law of large numbers (Wets et al. (1988, 1990, 1993, 1996)):
$\text{epi-gph}\,\frac{1}{n}\sum_{i=1}^n F(\cdot, y_i) \to \text{epi-gph}\,\mathbb{E}_y F(\cdot, y)$ (convergence of sets).
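A deterministic toy example, not from the slides, shows epigraph convergence to a discontinuous limit without uniform convergence. Take $F_n(x) = -\max(0,\ 1 - n|x - 1/n|)$ on $X = [-1, 1]$: a narrow downward spike at $x = 1/n$. Its pointwise limit is identically 0, whereas its epigraphical limit is $F(0) = -1$, $F(x) = 0$ otherwise. The names below are hypothetical.

```python
def F_n(n, x):
    """Downward spike of depth 1 centred at 1/n with half-width 1/n."""
    return -max(0.0, 1.0 - n * abs(x - 1.0 / n))


def F(x):
    """Epigraphical limit: lower semicontinuous and discontinuous at 0."""
    return -1.0 if x == 0.0 else 0.0


def grid(n):
    """Points k / (4n) covering X = [-1, 1]; the spike centre 1/n is among them."""
    return [k / (4 * n) for k in range(-4 * n, 4 * n + 1)]


for n in (1, 10, 100):
    pts = grid(n)
    x_n = min(pts, key=lambda x: F_n(n, x))   # minimizer of F_n on the grid
    gap_at_zero = abs(F_n(n, 0.0) - F(0.0))   # lower bound for sup |F_n - F|
    print(n, x_n, F_n(n, x_n), gap_at_zero)
```

The minimizers $1/n$ tend to 0 and the minimal values stay at $-1$, matching the minimum of $F$, while the gap at $x = 0$ stays close to 1, so there is no uniform convergence. The pointwise limit alone would wrongly predict a minimum value of 0.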
Decision-making problems: further generalizations
Problems with constraints in the form of inclusions:
$F(x^*(y), y) = \min_{\{x\in X:\ 0\in S(x,y)\}} F(x, y)$
Special cases:
$x \in X(y)$ or $0 \in S(x, y)$ (generalized equation);
$\hat F(x^*(y)) = \min_{\{(x,z):\ x+z\in X(y)\}} [f(x, z) = F(x) + \mathbb{E}_y \lambda\|z\|]$ (two-stage programming problem).
Stochastic formulations:
$0 \in \mathbb{E}_\xi S(x, \xi)$ (stochastic inclusion);
$F(x^*) = \min_{\{x:\ 0\in \mathbb{E}_\xi S(x,\xi)\}} F(x)$.