Masterkent
Speech volume control: possible approaches to distinguish speech from other sounds
Nov 25th, 2017 at 9:46pm
There are several possible ways to distinguish speech sounds (which should be altered by the speech volume control) from non-speech sounds (which should not be altered by the speech volume control). Let's compare some notable variants.

1) Status quo: a sound is handled as speech if and only if it's played on slot SLOT_Talk.

2) New PlaySound-like function (e.g. PlaySpeechSound or PlayVoiceSound): a sound is handled as speech if and only if it's played via a special function for playing speech sounds.

3) New parameter in PlaySound: a sound is handled as speech if and only if it's played via a call to a PlaySound-like function with the corresponding additional argument.

4) New sound slot (SLOT_Speech or SLOT_Voice) or slots: a sound is handled as speech if and only if it's played on any special speech slot.

4a) Only one slot is reserved for speech.

4b) Two or more slots are reserved for speech.

5) New actor property (bIsSpeechPlayer or bIsVoicePlayer): a sound is handled as speech if and only if it's played by an actor whose bIsSpeechPlayer/bIsVoicePlayer is true at the moment of calling the native function responsible for playing the sound (no matter what sound slot is used).

1. Compatibility with old game versions and UT.

Some modders may want to support 227j along with older versions or UT. Hereafter, support for several game versions (including older versions of Unreal 1 and possibly UT) is referred to as multi-version support.

Multi-version support can be achieved in different ways, in particular:

- using only the common subset of features shared between all target versions;

- creating a different set of packages for each target version;

- using some tricks that allow the same package to use new 227 features and remain compatible with other game versions;

- using additional packages for implementing advanced functionality by means of new 227 features.

Multi-version support can be partial: for example, the server-side part of a multiplayer mod may support 227j while its client-side part may support older game versions.

1.1. Source code compatibility.

Source code compatibility implies that the same UScript code can be compiled against Engine.u and/or other standard packages of both the new 227 and other game versions. It has practical value when multi-version support is achieved by creating a different set of packages for each target version and a certain subset of the source code is shared between the versions for convenience. In general, source code compatibility can be achieved with or without writing auxiliary (aux) methods that serve as a compatibility layer.

1.1.1. Source code compatibility without aux methods.

1) SLOT_Talk is supported by old versions and UT, so its use does not affect source code compatibility.

2) Any code that directly uses new function identifiers like PlaySpeechSound cannot be compiled against the original Engine.u of older versions (hacking Engine.u can make successful compiling possible).

3) Any code that passes an additional argument to PlaySound cannot be compiled against the original Engine.u of older versions.

4) Any code that directly uses an identifier that designates the new enumerator SLOT_Speech/SLOT_Voice cannot be compiled against the original Engine.u of older versions, except when the identifier is used in the defaultproperties section.

5) Any code that directly uses an identifier that designates the new actor property bIsSpeechPlayer/bIsVoicePlayer cannot be compiled against the original Engine.u of older versions, except when the identifier is used in the defaultproperties section.
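The defaultproperties exception in items 4 and 5 can be pictured with a minimal sketch. VoiceEmitter is a hypothetical mod class, and SLOT_Voice and bIsVoicePlayer are only the proposed names, not identifiers that exist in any released Engine.u. On versions that lack them, the two lines are presumably skipped (possibly with a compiler warning), which would leave VoiceSlot at SLOT_None.

```unrealscript
// Hypothetical mod class; intended to compile against either an old or a new Engine.u.
// SLOT_Voice and bIsVoicePlayer below are the proposed names, used only in defaultproperties.
class VoiceEmitter expands Info;

// Declared locally, so this name always resolves regardless of the engine version.
var ESoundSlot VoiceSlot;

defaultproperties
{
	VoiceSlot=SLOT_Voice
	bIsVoicePlayer=True
}
```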

1.1.2. Source code compatibility with aux methods.

2) There is no way to call PlaySpeechSound on older versions without using the function identifier.

3) There is no workaround for passing extra arguments to PlaySound (other than hacking Engine.u).

4) The value of SLOT_Speech/SLOT_Voice can be obtained from a string:

Code
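class GameSupport expands Info;

// Slot to use for speech sounds; resolved when the helper starts.
var ESoundSlot VoiceSlot;

simulated event BeginPlay()
{
	InitVoiceSlot();
}

simulated function InitVoiceSlot()
{
	// Resolve the enumerator by name, so no compile-time reference to SLOT_Voice is needed.
	SetPropertyText("VoiceSlot", "SLOT_Voice");
	// If the name is unknown (older versions), VoiceSlot stays SLOT_None: fall back to SLOT_Talk.
	if (VoiceSlot == SLOT_None)
		VoiceSlot = SLOT_Talk;
	// Publish the result through the class default, so callers need no instance reference.
	default.VoiceSlot = VoiceSlot;
}

defaultproperties
{
	bAlwaysRelevant=True
	RemoteRole=ROLE_SimulatedProxy
}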


Usage:

Code
SoundSource.PlaySound(ASound, class'GameSupport'.default.VoiceSlot); 


This workaround allows the same source to be compiled against an unmodified 224 - 227 Engine.u (the mod should ensure that an instance of GameSupport is created before calling PlaySound).
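One way a mod might make sure the helper exists before the first voice sound is played is sketched below. VoiceModInit is a hypothetical class, and any actor that starts early enough would serve equally well. Since GameSupport is declared with bAlwaysRelevant and ROLE_SimulatedProxy, the server-side instance should also reach the clients, where its simulated BeginPlay is expected to fill the default there too.

```unrealscript
// Hypothetical entry point of a mod; not part of the original proposal.
class VoiceModInit expands Info;

event PostBeginPlay()
{
	local GameSupport Support;

	Super.PostBeginPlay();

	// Look for a helper that has already been spawned (Support stays None if there is none).
	foreach AllActors(class'GameSupport', Support)
		break;

	// GameSupport.BeginPlay() runs during Spawn and fills class'GameSupport'.default.VoiceSlot.
	if (Support == None)
		Spawn(class'GameSupport');
}
```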

5) bIsSpeechPlayer/bIsVoicePlayer can be set to true via SetPropertyText:

Code
simulated event BeginPlay()
{
	if (Level.NetMode != NM_DedicatedServer)
		SetPropertyText("bIsVoicePlayer", "true");
} 


This code can be compiled against an unmodified 224 - 227 Engine.u and would not produce erroneous behavior when running on 224 - 226.

1.2. Binary package compatibility.

Binary compatibility implies that the same package compiled against Engine.u and other standard packages of the new game version can be successfully loaded on old game versions (that is, the game doesn't crash or say something like "Can't find object %ObjectName% in package Engine, goodbye" when trying to load the package and anything that contains hard references to it). It's valuable when the same package is supposed to be used on several game versions.

1) Using SLOT_Talk does not affect binary compatibility.

2) Referencing a final function from Engine.u prevents loading the package when the current Engine.u does not provide the given function. Referencing a non-final function would preserve the possibility of loading the package (compiled with 227j) on older versions. In other words, if PlaySpeechSound is declared as final, referencing it would break binary compatibility; if it's declared as non-final, referencing it would not break binary compatibility.

3) Passing additional arguments to a function does not affect binary compatibility.

4) Referencing new enum constants (such as SLOT_Speech/SLOT_Voice) does not affect binary compatibility.

5) Referencing a new actor property (such as bIsSpeechPlayer/bIsVoicePlayer) from Engine.u prevents loading the package when the current Engine.u does not provide the given property; setting the default value in the defaultproperties section does not affect binary compatibility.

1.3. Functional compatibility: correct execution on older game versions.

If a package containing some UScript code can be successfully loaded under new and old game versions, this does not imply that the same instruction inside the UScript code will work as intended in either case. Some functionality may require writing different branches of code for different subsets of supported game versions. The relevant branch is supposed to be selected at run time based on the available information about the current game version (such as version number, subversion number, and results of feature tests).

1) SLOT_Talk is known to be supported by every existing audio subsystem, hence no special actions are required.

2) Calling a non-existing function like Engine.Actor.PlaySpeechSound at run time would crash the game, therefore the caller would have to ensure that such a function is never actually called on game versions that do not support it. Example:

Code
class GameSupport expands Info;

var bool bHasPlaySpeechSound;

simulated event BeginPlay()
{
	default.bHasPlaySpeechSound =
		int(Level.EngineVersion) >= 227 &&
		DynamicLoadObject("Engine.Actor.PlaySpeechSound", class'Object', true) != none;
} 


Code
if (class'GameSupport'.default.bHasPlaySpeechSound)
	SoundSource.PlaySpeechSound(ASound);
else
	SoundSource.PlaySound(ASound, SLOT_Talk);
 


3) Passing an additional argument to a function (which does not have the corresponding parameter) at run time would crash the game, therefore the caller would have to ensure that such a function call is never actually evaluated on game versions that do not support it.

4) An attempt to play a sound on an additional sound slot on old clients could be considered safe only after reading the corresponding C++ source code responsible for playing sounds. Experiments alone cannot serve as proof that function calls like SoundSource.PlaySound(ASound, SLOT_Voice) are evaluated correctly there: even though they may seem to work fine, the low-level behavior may be erroneous. For example, if sound slots are represented as items of an array, there is a possibility that using an unsupported slot implies accessing the array out of its bounds. Such invalid operations could corrupt the memory and eventually cause random glitches or crash the game, possibly under conditions that are hard to reproduce.

224, 225, and 226a/final clients are rarely used, but 226b is worth checking. At this moment, I presume that the existing implementations use an array of 8 slots and calculate the actual slot index as Slot % 8 where Slot is the requested value converted to integer. If so, functional compatibility should not be a problem and the remainder of this clause can be ignored.
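The presumed mapping can be written out as a small sketch. PresumedSlotIndex is a hypothetical helper (to be placed in any class) that only mirrors the guess above; the real computation lives in the C++ code of the audio subsystem and has not been verified here.

```unrealscript
// Hypothetical helper mirroring the presumed native mapping "Slot % 8".
static final function int PresumedSlotIndex(int RequestedSlot)
{
	// The classic % operator works on floats in UnrealScript, hence the explicit int() cast.
	return int(RequestedSlot % 8);
}
```

Under this assumption, a request for slot 7 maps to index 7, the last element of an 8-item array, so it would stay within bounds; a value of 8 would wrap around to index 0.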

In the worst case, if a request to play a sound on slot 7 (which would be the zero-based index of SLOT_Speech/SLOT_Voice) can be proven to be an invalid operation under 226b, such attempts should be prevented as described below.

- Replication of PlaySound in a network game should automatically change SLOT_Speech/SLOT_Voice to SLOT_Talk or SLOT_None when the replication target is an old client.

- Calls to ClientPlaySound that might cause an invalid call to PlaySound on older clients should be wrapped:

[In class PlayerPawn]
Code
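function CompatibleClientPlaySound(sound ASound, optional ESoundSlot SlotType)
{
	// Slots up to SLOT_Interface exist in every version; newer slots are passed through
	// only to 227 clients whose subversion is at least 10.
	if (SlotType <= SLOT_Interface ||
		(GetClientVersion() >= 227 &&
		 GetClientVersion() < 300 &&
		 GetClientSubVersion() >= 10))
	{
		ClientPlaySound(ASound, SlotType);
	}
	else
		ClientPlaySound(ASound, SLOT_Talk);
}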


- Authors of mods for which compatibility with old clients matters should use similar run-time switches based on Level.EngineVersion or GetClientVersion.

5) An attempt to modify the value of a non-existing property in a portable way (defaultproperties or SetPropertyText) has no effect.
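The "feature tests" mentioned at the start of this section could look like the following sketch for approach 5. SupportsVoicePlayerFlag is a hypothetical helper. It assumes that GetPropertyText is available wherever SetPropertyText is, and that it returns an empty string for an unknown property name; both assumptions should be checked against the target versions.

```unrealscript
// Hypothetical feature test; works without any compile-time reference to the new property.
static final function bool SupportsVoicePlayerFlag(Actor A)
{
	// Assumption: an unknown property name yields "", while an existing bool yields "True" or "False".
	return A.GetPropertyText("bIsVoicePlayer") != "";
}
```

A mod could evaluate this once at startup and cache the result, in the same way as bHasPlaySpeechSound in the earlier example.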

1.3.1. Replication across network.

A server-side evaluation may invoke PlaySound or initiate a remote procedure call for a function (such as Engine.PlayerPawn.ClientPlaySound) that calls PlaySound client-side. Ideally, we should be able to use similar techniques for playing speech sounds. In particular, SpecialEvent actors should be able to initiate playing a speech sound by means of calling ClientPlaySound.

1) PlaySound, ClientPlaySound, and custom functions can send a request to play a sound on slot SLOT_Talk.

2) Adding a new replicated function PlaySpeechSound/PlayVoiceSound in Engine.Actor may be a problem. As far as I remember, such additions break backward compatibility, and there is no general workaround for this issue. In some cases, it's possible to replicate a call to a native function as a call to another replicated function (e.g. PlayOwnedSound may call PlaySound), but I don't see an obvious solution in this case, except that PlaySpeechSound/PlayVoiceSound may invoke PlaySound using an additional sound slot - this would make the given alternative similar to 4.

3) I don't know whether adding a new parameter to a final native function can break its replication or not.
Adding a new parameter to ClientPlaySound would break compatibility with existing code that overrides this function in derived classes (we should avoid doing such things).

4) PlaySound, ClientPlaySound, and custom functions can send a request to play a sound on new slot SLOT_Speech/SLOT_Voice.

5) If the actor which is supposed to play a speech sound is known to always have bIsSpeechPlayer/bIsVoicePlayer equal to true (e.g. it can be set as the default value), then PlaySound functions can send a request to play the speech sound on the given actor. A PlayerPawn actor should not be presumed to be an actor that plays only speech sounds.

Otherwise, we might need to perform 3 operations client-side (strictly in order):

- set bIsSpeechPlayer/bIsVoicePlayer to true, then
- call a PlaySound-like function, then
- set bIsSpeechPlayer/bIsVoicePlayer to false.

This would require a remote procedure call to a special function. As far as I see, we can't provide it (similarly to PlaySpeechSound).
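For illustration, the client-side part of those three steps might look like the sketch below. PlayAsVoice is a hypothetical helper, and how the call would reach the client is deliberately left open, since that is exactly the unsolved part. The sketch also assumes that the flag was false beforehand and that the native PlaySound reads the flag at the moment of the call.

```unrealscript
// Hypothetical client-side helper; not an existing Engine function.
simulated function PlayAsVoice(Actor SoundSource, Sound ASound)
{
	// 1. Mark the actor as a speech player (no effect where the property is absent).
	SoundSource.SetPropertyText("bIsVoicePlayer", "true");
	// 2. Play the sound while the flag is set.
	SoundSource.PlaySound(ASound, SLOT_None);
	// 3. Clear the flag so that later sounds are treated as non-speech again.
	SoundSource.SetPropertyText("bIsVoicePlayer", "false");
}
```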

2. Sound amplification.

A single call to PlaySound with the maximum natively supported volume may still play the sound too quietly. Currently, a higher volume can be achieved by calling PlaySound several times. Every call should use either a distinct slot or SLOT_None. Playing another sound on the same slot (other than SLOT_None) of the same actor would either override the previously played sound or be skipped, depending on bNoOverride.
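The trick with repeated calls can be sketched as follows. PlayAmplified is a hypothetical helper meant to live in an Actor subclass; Count and Volume are chosen by the caller, and how much louder the result sounds depends on the audio subsystem.

```unrealscript
// Hypothetical helper illustrating amplification through repeated PlaySound calls.
static final function PlayAmplified(Actor SoundSource, Sound ASound, int Count, float Volume)
{
	local int i;

	// Every call uses SLOT_None, so the copies play together instead of
	// overriding one another on a shared slot.
	for (i = 0; i < Count; ++i)
		SoundSource.PlaySound(ASound, SLOT_None, Volume);
}
```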

227j could allow speech sounds to be played at a higher volume by changing the volume limit per invocation of a PlaySound-like function, either for speech sounds only or for any sounds. Allowing a higher volume limit only for speech sounds doesn't look good, because sometimes other sounds need to be amplified too, and some modders may decide to play a non-speech sound as speech in order to achieve the desired volume.

When the same part of a mod is supposed to be compatible with older game versions or UT and to play speech sounds at similar volume in relation to other sounds under all supported versions (assuming that speech volume control of 227j is disabled), its implementation cannot rely solely on 227j features and has to use the trick with multiple calls to PlaySound. On the other hand, a speech sound can be amplified through multiple calls to PlaySound under all supported versions if the implementation of speech volume control makes this possible.

1) We can't amplify a speech sound by means of multiple calls to PlaySound using only SLOT_Talk.

2) PlaySpeechSound cannot be called by older versions.

3) Passing more arguments than supported by older versions of the function makes a correct execution of the function call impossible under old versions.

4a) We can't amplify a speech sound by means of multiple calls to PlaySound using only SLOT_Speech/SLOT_Voice.

4b) In order to support the trick with multiple calls to PlaySound, we'd have to introduce several speech slots which would not match SLOT_Ambient under old game versions and UT. Matching to SLOT_None (if any) should be documented.

5) bIsSpeechPlayer/bIsVoicePlayer would not alter the possibility to use several calls to PlaySound for amplifying the played sound.

3. Playing two speech sounds by the same actor simultaneously.

Usually one speech sound should not overlap with another, but in some rare cases a modder might want to allow such overlap intentionally. Whether two voice messages can be played simultaneously by the client's PlayerPawn actor and both be altered by the speech volume control depends on the chosen implementation of the speech volume control.

1) An actor can play only one sound on SLOT_Talk at a time.

2) PlaySpeechSound/PlayVoiceSound could play several sounds simultaneously.

3) Passing an additional argument to PlaySound would not alter the possibility to play as many sounds as PlaySound could play without that argument.

4a) An actor could play only one sound on the single slot SLOT_Speech/SLOT_Voice at a time.

4b) Several speech slots might allow playing more sounds at a time.

5) bIsSpeechPlayer/bIsVoicePlayer would not alter the possibility to play as many sounds as PlaySound could play without such extensions.

4. Functional compatibility with the existing code base.

Ideally, the speech volume control should be able to alter any speech we want to control and not alter anything else. This ideal is unreachable in practice, because there exists a lot of legacy UScript code (including scripts written by Epic) which does not follow any common convention for playing speech and non-speech sounds that would let us reliably distinguish the former from the latter. Hence, the best we can do is to try to find the most practically useful approximation. In particular, it would be good to achieve the following goals:

- false-positive matches of speech should be reduced to a bare minimum (that is, gasp sounds, jump sounds, and other non-speech sounds should not be altered by the speech volume control);

- there should exist an easy and obvious way to set the original balance between speech and non-speech sounds (such as a new checkbox for enabling/disabling the speech volume control in UMenu).

1) Epic and mod writers used SLOT_Talk for playing non-speech sounds, so an association of speech sounds with SLOT_Talk inevitably results in false-positive matches.

2) Except when erroneously used in new mods, PlaySpeechSound can play only speech sounds.

3) Except when erroneously used in new mods, PlaySound with the corresponding additional argument can play only speech sounds.

4) Except when erroneously used in new mods, SLOT_Speech/SLOT_Voice can be used for speech sounds only.

5) Except when erroneously used in new mods, actors whose bIsSpeechPlayer/bIsVoicePlayer is true can play only speech sounds.

The following table summarizes the advantages and disadvantages of the considered approaches:

                                    1) SLOT_Talk    2) PlaySpeechSound/   3) Additional      4) SLOT_Speech/        5) bIsSpeechPlayer/
                                                       PlayVoiceSound        parameter of       SLOT_Voice             bIsVoicePlayer
                                                                             PlaySound

1.1.1 Source code compatibility        yes             no                    no                 no                     no
without using aux methods

1.1.2. Source code compatibility       yes             no                    no                 yes                    yes
with aux methods

1.2. Binary package compatibility      yes             no  (final)           yes                yes                    no  (hard refs)
with old game versions & UT                            yes (non-final)                                                 yes (dynamic refs)

1.3. Correct execution on older        yes             no                    no                 ?                      yes                         
game versions & UT without
run-time switches
(w/o considering amplification)

1.3.1. PlaySound-like replication      yes             no                    ?                  yes                    no  (general case)
support                                                                                                                yes (speech player only)

A request to play a speech sound       yes             no                    no                 yes                    no
can be sent via ClientPlaySound

2. Sound amplification can be          no              no                    no                 no (1 speech slot)     yes
done uniformly for all versions                                                                 yes (>1 speech slots)

3. Several speech sounds can be        no              yes                   yes                no (1 speech slot)     yes
played by the same actor                                                                        yes (>1 speech slots)

4. No false-positive matches of        no              yes                   yes                yes                    yes
speech sounds

At this moment I'd prefer a mix of approaches 4a and 5: a sound is handled as speech if and only if it's played on slot SLOT_Speech/SLOT_Voice or by an actor whose bIsSpeechPlayer/bIsVoicePlayer is true at the moment of calling the native function responsible for playing the sound. In particular, this combination would provide the most convenient way to modify UT scripts in order to implement proper support of the speech volume control without regressive changes.
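The combined rule can be stated as a predicate. The sketch below is only an illustration: in 227j the decision would be made in native code, IsHandledAsSpeech is a hypothetical name, VoiceSlot stands for whatever value the dedicated speech slot ends up having, and GetPropertyText is assumed to return "True" for a bool property that is set.

```unrealscript
// Hypothetical illustration of the 4a + 5 rule; meant to live in an Actor subclass.
static final function bool IsHandledAsSpeech(Actor SoundSource, ESoundSlot Slot, ESoundSlot VoiceSlot)
{
	// Approach 4a: the sound is played on the dedicated speech slot.
	if (Slot == VoiceSlot)
		return true;

	// Approach 5: the playing actor is flagged as a speech player at the time of the call.
	// The property is read by name, so the sketch compiles without the new property.
	return SoundSource.GetPropertyText("bIsVoicePlayer") ~= "true";
}
```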

BTW, the UT version of ClientPlaySound does not transfer any information that could indicate whether the sound is speech or not. In my port of Botpack, I had to add a new parameter SlotType to that function besides changing its name. This is a notable example of specific changes in the API which modders may need in order to support the speech volume control (such changes would be unnecessary without supporting a special handling of speech).
  