Intelligent Media Laboratory (mimlab)
Automatically protecting privacy in consumer generated videos using intended human object detector

Oct 1, 2010 · Yuta Nakashima, Noboru Babaguchi, Jianping Fan

Type: Conference paper
Publication: Proc. ACM International Conference on Multimedia (MM)
Last updated on Oct 1, 2010


© 2025 Department of Intelligent Media, First Research Division, The Institute of Scientific and Industrial Research (SANKEN), Osaka University
