Using a computer as the source is really not a good idea if you are serious about audio. Adding a DAC may change the sound signature, but is it an improvement? You tried laptop vs handphone, and the handphone sounded better. So you have your answer: the source itself, the laptop, is no good for what you need. Just use your handphone or buy a dedicated player.
I never like to use my handphone to play songs because it drains my battery... Anyway, the source is the issue.
I have a Z623. I personally find that the Z623 satellites need to be between 1.5 and 2.1 meters apart to get a nice and more realistic soundstage. That being said, my room is about 140 sq ft. Being THX certified, I think it articulates vocals in a very lifelike way (not silky smooth, sweet and elevated, but more like a speech presentation, the way a cinema does it) and really 'centers', 'thickens' and articulates the voice/vocals, like hearing a really good speech by an experienced presenter. I find that below 1.5m of separation, the vocals and speech are just way too unnatural: too loud in the mix and way too articulated, with the pronunciation of each word over-emphasized. This is good for movies and when you don't have subtitles, but I really think it's overdone for music UNLESS:
You are able to get at least 1.5m of distance between the satellites, then you may hear a difference. On DACs, I have 2 entry-level DACs, an ODAC V1 and a D Zero MK2 (both no longer in production), but the presentation of the two DACs is really different. The ODAC sounds flat and dry BUT its timbre and ADSR are reproduced with good lifelike accuracy. ADSR = Attack, Decay, Sustain, Release (all 'natural' instruments have this, but electronically made music can manipulate it to a great and unnatural extent).
E.g. ODAC V1 - the electric guitars are lifelike. It can produce the 'sawtooth/distortion effect' of the guitar with what I personally think is good accuracy, i.e. the buzzing and 'crunchiness' is wonderful, just like a real guitar amp. The rise and fall of each waveform is just 'accurate': if it is a sawtooth wave, it will sound like it; if it is a sine wave, it will sound like it.
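For anyone wondering why a sawtooth and a sine sound so different in the first place, here is a rough Python sketch (my own illustration, it says nothing about how either DAC works inside). It builds one cycle of each waveform and estimates how strong each harmonic is. The function names and the sample count are just assumptions for the example.

```python
import math


def sine(phase):
    """Pure sine wave; phase is measured in cycles (0.0 to 1.0 is one cycle)."""
    return math.sin(2 * math.pi * phase)


def sawtooth(phase):
    """Naive sawtooth: ramps linearly from -1 to 1, then snaps back."""
    return 2.0 * (phase - math.floor(phase + 0.5))


def harmonic_amplitude(wave, n, samples=4096):
    """Estimate the amplitude of the n-th harmonic over one cycle.

    This is a plain correlation against cos/sin at n times the
    fundamental, i.e. a single bin of a discrete Fourier transform.
    """
    re = 0.0
    im = 0.0
    for k in range(samples):
        phase = k / samples
        x = wave(phase)
        re += x * math.cos(2 * math.pi * n * phase)
        im += x * math.sin(2 * math.pi * n * phase)
    return 2.0 * math.hypot(re, im) / samples


if __name__ == "__main__":
    for n in range(1, 6):
        print(n,
              round(harmonic_amplitude(sine, n), 3),
              round(harmonic_amplitude(sawtooth, n), 3))
```

The sine has energy only at the fundamental, while the sawtooth keeps going up the harmonic series with amplitude falling roughly as 1/n. Those upper harmonics are the 'buzz', so gear that smooths them out will make a distorted guitar sound duller.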


ADSR
Attack
How quickly the sound reaches full volume after the sound is activated (the key is pressed/string is strummed or picked). For most mechanical instruments, this period is virtually instantaneous. However, for some popular synthesized voices that don't mimic real instruments, this parameter is slowed down. Slow attack is commonly part of sounds called pads.
Decay
How quickly the sound drops to the sustain level after the initial peak.
Sustain
The "constant" volume that the sound takes after decay until the note is released. Note that this parameter specifies a volume level rather than a time period.
Release
How quickly the sound fades when a note ends (the key is released). Often, this time is very short. An example where the release is longer might be a percussion instrument like a glockenspiel, or a piano with the sustain pedal pressed.
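
The four stages above can be sketched as a simple linear envelope in Python. This is only a minimal illustration of the definitions, assuming straight-line ramps and a note that is held past the attack and decay stages; real synths often use curved segments. The function name and the example timings are made up for the sketch.

```python
def adsr_envelope(t, attack, decay, sustain_level, release, note_off):
    """Return the envelope gain (0.0 to 1.0) at time t seconds after note-on.

    attack, decay and release are durations in seconds. sustain_level is a
    volume level, not a time. note_off is the moment the key is released,
    assumed here to come after the attack and decay stages have finished.
    """
    if t < 0:
        return 0.0
    if t < note_off:
        if t < attack:
            # Attack: ramp from silence up to full volume.
            return t / attack
        if t < attack + decay:
            # Decay: fall from full volume down to the sustain level.
            frac = (t - attack) / decay
            return 1.0 + frac * (sustain_level - 1.0)
        # Sustain: hold a constant level until the key is released.
        return sustain_level
    if release <= 0:
        return 0.0
    # Release: fade from the sustain level to silence.
    frac = (t - note_off) / release
    return max(0.0, sustain_level * (1.0 - frac))


if __name__ == "__main__":
    # Hypothetical settings: 10 ms attack, 100 ms decay, sustain at 60%,
    # 200 ms release, key released after half a second.
    for ms in (0, 5, 10, 60, 300, 500, 600, 700):
        gain = adsr_envelope(ms / 1000, 0.01, 0.1, 0.6, 0.2, 0.5)
        print(ms, "ms ->", round(gain, 3))
```

Multiplying a raw waveform by this gain sample by sample shapes a note. A very short attack gives a plucked or struck feel, a long attack gives the slow swell of a pad.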
On the D Zero MK2, electric guitars just don't work (they sound really smoothed out and become simply another background instrument in the mix)... ODAC for a wide and big soundstage & 'lifelike-ness', but it is what it is: uncolored, flat, transparent (unless your speakers are really colored). DZMK2 for a deeper but narrower soundstage and a fun, smooth sound.
If you are into 2.1 purely for music, give the Swan M50W (very colored, but in an interesting way) + ODAC a try... The Z623 is really just made for movies, BUT IT'S REALLY GOOD AT MOVIES. Try watching Interstellar at high volume, the scene when the rocket takes off or the scene where there is a sudden explosion in dead silent space. That sudden surge of sound was almost similar to what I experience in a GSC THX/Dolby Atmos cinema. Gunshots and L-R stereo effects are really good. So for me, yes, the DAC and speaker combo does change the sound character, but which one suits you will depend on your taste. To me the M50W absolutely cannot do movies. It simply cannot present an extremely quick and sudden change in sound level and position to the listener, and speech is not pronounced or articulated well, so you will need subtitles, especially in angmoh movies.
Also note that THX certified multimedia speakers have 3 very different input sensitivities/impedances, one for each input (RCA, 3.5mm & 3.5mm AUX). This may cause different frequency response, loudness, harmonics and distortion on each input for the same source. The good news is you have 3 choices to suit your gear.
I'm more particular when it comes to music than watching movies. And I realised I should get a 2.0 rather than 2.1. If 2.0, what would you recommend?