
The web provides an unprecedented opportunity to evaluate ideas
quickly using controlled experiments, also called randomized
experiments (single-factor or factorial designs), A/B tests (and
their generalizations), split tests, Control/Treatment, and parallel
flights. In the simplest manifestation of such experiments, live
http://exp-platform.com/hippo.aspx Page 2
users are randomly assigned to one of two variants: (i) the
Control, which is commonly the "existing" version, and (ii) the
Treatment, which is usually a new version being evaluated.
Metrics of interest, ranging from runtime performance to implicit
and explicit user behaviors and survey data, are collected.
Statistical tests are then conducted on the collected data to
evaluate whether there is a statistically significant difference
between the two variants on metrics of interest, thus permitting us
to retain or reject the (null) hypothesis that there is no difference
between the versions. In many cases, drilling down into user
segments, either manually (e.g., with OLAP) or with machine
learning and data mining techniques, lets us identify which
subpopulations show significant differences, which improves our
understanding and helps us move an idea forward.
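
As a rough illustration of the statistical test described above, the following Python sketch compares a binary metric (for example, conversion) between Control and Treatment with a two-sided two-proportion z-test. The function name, its parameters, and the choice of a pooled z-test are assumptions made for this example; the text does not prescribe a particular test.

```python
import math


def two_proportion_z_test(conv_c, n_c, conv_t, n_t):
    """Two-sided z-test for a difference in conversion rate.

    conv_c, conv_t: number of converting users in Control / Treatment.
    n_c, n_t:       number of users assigned to Control / Treatment.
    Returns (z, p_value). Hypothetical helper, for illustration only.
    """
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    # Pooled rate under the null hypothesis of no difference.
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(conv_c=100, n_c=1000, conv_t=150, n_t=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value (commonly below 0.05) would lead us to reject the null hypothesis that the two variants do not differ on this metric; otherwise we retain it.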
Controlled experiments provide a methodology to reliably
evaluate ideas. Unlike other methodologies, such as post-hoc
analysis or interrupted time series (quasi experimentation) (5), this
experimental design methodology tests for causal relationships (6
pp. 5-6). Most organizations have many ideas, but the
return-on-investment (ROI) for many may be unclear and the evaluation
itself may be expensive. As shown in the next section, even
minor changes can make a big difference, and often in unexpected
ways. A live experiment goes a long way in providing guidance
as to the value of the idea. Our contributions include the
following.
• In Section 3 we review controlled experiments in a web
environment and provide a rich set of references, including an
important review of statistical power and sample size, which
are often missing in primers. We then look at techniques for
reducing variance that we found useful in practice. We also
discuss extensions and limitations so that practitioners can
avoid pitfalls.
• In Section 4 we present generalized architectures that unify
multiple experimentation systems we have seen, and we discuss
their pros and cons. We show that some randomization and
hashing schemes fail conditional independence tests required
for statistical validity.
• In Section 5 we provide important practical lessons.
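
To make the randomization and hashing schemes mentioned for Section 4 more concrete, here is a minimal Python sketch of hash-based assignment. It is an illustration under stated assumptions, not the architecture the paper presents: the function name, the SHA-256 hash, the `experiment_id:user_id` key format, and the 10,000-bucket resolution are all choices made for this example. Whether a given scheme passes the independence tests the text refers to has to be checked empirically.

```python
import hashlib

NUM_BUCKETS = 10_000  # assumed resolution for the traffic split


def assign_variant(user_id: str, experiment_id: str,
                   treatment_fraction: float = 0.5) -> str:
    """Deterministically map a user to "Control" or "Treatment".

    Hashing the experiment id together with the user id (an assumption
    of this sketch) is meant to keep assignments in one experiment from
    lining up with assignments in another.
    """
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes as an integer and reduce it to a bucket.
    bucket = int.from_bytes(digest[:8], "big") % NUM_BUCKETS
    threshold = round(treatment_fraction * NUM_BUCKETS)
    return "Treatment" if bucket < threshold else "Control"


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_variant(uid, "exp-homepage"))
```

Because the assignment depends only on the two identifiers, a returning user sees the same variant on every visit without any stored state.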
When a company builds a system for experimentation, the cost of
testing and experimental failure becomes small, thus encouraging


