Views: 1365 | Replies: 3
[Bounty] The author he1wen2zhi will award you 5 gold coins for answering the question in this thread.
he1wen2zhi, New Member (Newcomer)
[Help] Major revision, 20 days, three reviewers — how should I revise? (1 reply so far)
Could anyone advise which aspects I should focus on in the revision, and which experiments I need to add?

Also, Reviewer #3 says I did not run experiments on the dataset I proposed, but my paper actually states that I used my own dataset for training. How can I respond to this reasonably? I would really appreciate any suggestions. The three reviews are as follows:

Reviewer #1: This paper presents an audio-visual cross-modality generation method for talking face videos with rhythmic head motion. The studied topic is meaningful. The authors are suggested to further improve the paper from the following aspects. The quality evaluation of the generated audio-visual talking heads is very important for the method design. The authors have used some criteria for evaluation. The authors may discuss whether it is possible to use quality assessment methods for evaluation, for example the audio-visual quality assessment methods proposed in 'Study of subjective and objective quality assessment of audio-visual signals' and 'Attention-Guided Neural Networks for Full-Reference and No-Reference Audio-Visual Quality Assessment'. The authors are suggested to give some discussions on this aspect and the above works. Regarding the claim 'The proposed method demonstrates improved performance in terms of video quality compared to traditional approaches', some discussion of visual quality assessment is suggested here, considering that there are many visual quality assessment studies in the literature, for example 'Blind quality assessment based on pseudo-reference image', 'Blind image quality estimation via distortion aggravation', 'Unified blind quality assessment of compressed natural, graphic, and screen content images', 'Objective quality evaluation of dehazed images', and 'Quality evaluation of image dehazing methods using synthetic hazy images'.

Following the above comments, the quality assessment of multimedia signals is also highly relevant to this work; thus some surveys on quality assessment are suggested for the introduction section of the paper, for example 'Perceptual image quality assessment: a survey' and 'Screen content quality assessment: overview, benchmark, and beyond'. Audio-visual attention is critical for various audio-visual applications, and many audio-visual attention prediction methods have been proposed, for example 'A multimodal saliency model for videos with high audio-visual correspondence' and 'Fixation prediction through multimodal analysis'. The authors may discuss the possibility of using audio-visual attention prediction methods to improve the proposed method. The authors are suggested to give some discussions on this aspect and the above works.

Reviewer #2: This paper addresses the generation of realistic talking facial videos by incorporating audio and head pose information. Existing methods lack natural head pose generation and audio synchronization, impacting video realism. The authors propose Flow2Flow, an autoregressive method that encodes audio and historical head poses using a multimodal transformer block with cross-attention. They introduce AVVS, a large-scale dataset for investigating rhythmic head movement patterns. The proposed method generates identity-independent facial motion representations, enabling photo-realistic videos with natural head poses and accurate lip-syncing, as demonstrated through experiments and comparisons with state-of-the-art approaches on public datasets. However, some concerns should be addressed. The organization of the paper could benefit from improvements; e.g., part of the video synthesis material is introduced in the feature encoding part.

The authors point out that the full attention structure in the model excessively focuses on a single source during integration, leading to the neglect of crucial information from other modalities; as a result, accurately generating movements for the facial generation task becomes challenging. It would be helpful to provide supporting evidence or examples to further illustrate this issue. Instead of delving into the intricacies of flow theory, it would be more beneficial to focus on incorporating references in the facial attribute generation process. The model utilizes 15 neutral keypoints as facial attributes. It would be valuable for the authors to explore the impact of varying the number of keypoints and to investigate whether incorporating certain 3DMM parameters and other types of audio features would enhance the results. The authors have primarily focused on discussing the applications of common loss functions; however, IQA models also have wide-ranging applications in evaluating generative image, video, audio, and multimedia models, e.g., 'Blind image quality assessment via cross-view consistency' and 'Comparative perceptual assessment of visual signals using free energy features'. The authors are suggested to give some discussions on this aspect and the above works. Additionally, considering the significance of attention mechanisms, the authors are encouraged to discuss related works such as 'Toward visual behavior and attention understanding for augmented 360-degree videos', 'Viewing behavior supported visual saliency predictor for 360-degree videos', and 'Learning a deep agent to predict head movement in 360-degree images'.

Reviewer #3: This paper proposes a normalizing-flow-based network to generate realistic talking face videos, using audio and past head poses as inputs. The authors also contribute a solo-singing-themed audio-visual dataset called AVVS for research.
Strengths: 1. Experimental results show that the method can generate photo-realistic videos with natural head poses and lip-syncing, and the performance looks good. 2. Utilizing a normalizing flow model is novel and convincing.
Weaknesses: 1. It is kind of strange that I do not see any experiments on the AVVS dataset. Since you are proposing a dataset, I think some experiments should be conducted on it.
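For readers less familiar with the mechanism Reviewer #2 is discussing, here is a minimal sketch of scaled dot-product cross-attention, where one modality (audio) queries another (past head poses). This is plain NumPy with all names hypothetical; it is an illustration of the generic technique, not the paper's actual Flow2Flow block.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each query vector produces a
    weighted combination of the value vectors from the other modality."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (Tq, Tk) affinity matrix
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ values, weights

rng = np.random.default_rng(0)
T_audio, T_pose, d = 4, 6, 8
audio = rng.standard_normal((T_audio, d))    # per-frame audio features
pose = rng.standard_normal((T_pose, d))      # past head-pose features

fused, w = cross_attention(audio, pose, pose)
print(fused.shape)  # (4, 8): one pose-informed vector per audio frame
print(w.shape)      # (4, 6): attention of each audio frame over pose history
```

Reviewer #2's "over-focusing" concern corresponds to a row of `w` collapsing toward a one-hot distribution, so that a single pose frame dominates the fused representation; one common response in a rebuttal is to visualize these attention maps as the supporting evidence the reviewer asks for.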