The recognition and retrieval of actions in videos is challenging due to the need to
handle many sources of variation: viewpoint, size and appearance of actors, scene
lighting, video quality, etc. In this paper we introduce a novel action representation
based on motion dynamics that is robust to such variations.
Currently, state-of-the-art performance in action classification is achieved by extracting
dense local features (HOG, HOF, MBH) and grouping them in a bag-of-features
(BOF) framework [26]. The basic BOF representation ignores information about the
spatial and temporal arrangement of the local features by pooling them over the entire
video volume. More recently, it has been shown that considering the spatial and
temporal arrangements (dynamics) of an action (e.g., extracting a separate BOF model for
each subvolume of a video [14,26] or modelling the spatio-temporal arrangements of
the interest points [29]) adds more discriminative power to the representation.
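As a concrete illustration, the following Python sketch contrasts the basic BOF representation with a variant that keeps coarse temporal arrangement by pooling per temporal subvolume. It assumes local descriptors (e.g., HOG/HOF/MBH) have already been extracted; the function names, parameters, and synthetic data are illustrative assumptions, not from [26].

```python
# Minimal BOF sketch: codebook construction, whole-video pooling, and a
# temporal-grid variant. Descriptor data here is synthetic.
import numpy as np
from sklearn.cluster import KMeans


def build_codebook(descriptors, k=100, seed=0):
    """Cluster pooled training descriptors into a k-word visual vocabulary."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(descriptors)


def bof_histogram(codebook, descriptors):
    """Quantize each descriptor to its nearest word and pool into a single
    L1-normalized histogram over the entire video volume (basic BOF)."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)


def temporal_grid_bof(codebook, descriptors, frame_ids, n_cells=3):
    """Keep coarse temporal arrangement: split the video into n_cells
    temporal subvolumes, pool one BOF histogram per cell, concatenate."""
    bounds = np.linspace(frame_ids.min(), frame_ids.max() + 1, n_cells + 1)
    cells = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        mask = (frame_ids >= lo) & (frame_ids < hi)
        cells.append(bof_histogram(codebook, descriptors[mask])
                     if mask.any() else np.zeros(codebook.n_clusters))
    return np.concatenate(cells)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(size=(2000, 96))      # stand-in for HOG/HOF/MBH features
    video = rng.normal(size=(500, 96))
    frames = rng.integers(0, 120, size=500)  # frame index of each descriptor
    cb = build_codebook(train, k=50)
    print(bof_histogram(cb, video).shape)              # (50,)
    print(temporal_grid_bof(cb, video, frames).shape)  # (150,)
```

The temporal-grid variant makes the trade-off discussed above explicit: the representation grows by a factor of n_cells but retains information about when, within the video, each local pattern occurs.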
Our approach is based on the observation that the dynamics of an action provide a
powerful cue for discrimination. In Johansson's moving light display experiment, it was
shown that humans perceive actions by abstracting a coherent structure from the spatio-temporal
pattern of local movements [9]. While humans respond to both spatial and
temporal information, the spatial configuration of movements that comprise an action is
strongly affected by changes in viewpoint. This suggests that representing the temporal
structure of an action could be valuable for reducing the effect of viewpoint changes. Motivated
by this observation, we define human actions as a composition of temporal patterns of
movements.
