Related
So I've been traipsing through some of the other Qualcomm repositories searching for tidbits that might be of use for N1 kernels. As far as I can tell, none of these changes have been merged into the AOSP or CyanogenMod kernels. Unfortunately the underlying kernels have diverged, so these won't apply cleanly, but it should be possible to merge them manually with some effort.
First up, this patch reduces AXI (internal bus) speed when the apps CPU is running at lower clock speeds and increases it when at higher clock speeds. Memory would otherwise be the bottleneck for many applications.
https://www.codeaurora.org/gitweb/q...it;h=6caa6d84d8c687b5f66f5b5ea281183eae8947a8
Code:
msm: acpuclock-8x50: Couple CPU freq and AXI freq.
The memory throughput is directly proportional to the AXI frequency. The
AXI freq is coupled with the CPU freq to prevent the memory from being
the bottleneck when the CPU is running at one of its higher frequencies.
This will cause an increase in power consumption when the CPU is running
at higher frequencies, but will give a better performance/power ratio.
This patch adds core support for AXI clock changes.
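To make the coupling idea concrete, here is a minimal sketch of the approach in acpuclock terms: each entry in the CPU speed table carries a matching AXI rate, and both clocks get switched together. All names and table values below are illustrative assumptions, not the actual patch.
Code:
#include <linux/clk.h>

/* Illustrative sketch only - table values and helper names are made
 * up, not taken from the Code Aurora patch. */
struct clkctl_acpu_speed {
	unsigned int acpu_khz;
	unsigned int axi_khz;
};

static struct clkctl_acpu_speed acpu_freq_tbl[] = {
	{ 245760,  61440 },
	{ 576000, 128000 },
	{ 998400, 128000 },
	{ 0 },
};

static int acpuclk_set_cpu_khz(unsigned int khz); /* hypothetical helper */

static int set_cpu_and_axi(struct clk *axi_clk, struct clkctl_acpu_speed *s)
{
	/* Raise the AXI clock first when speeding up, so memory keeps pace. */
	int rc = clk_set_rate(axi_clk, s->axi_khz * 1000UL);

	if (rc)
		return rc;
	return acpuclk_set_cpu_khz(s->acpu_khz);
}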
https://www.codeaurora.org/gitweb/q...it;h=009d997b1439edf1991e181206c74ee3e943787e
Code:
msm: clock: Add SoC/board independent APIs to set/get max AXI frequency.
Some drivers need to set the AXI frequency to the maximum frequency
supported by the board. Add a MSM_AXI_MAX_FREQ_KHZ magic value that
allows them to achieve that in a SoC/board independent manner.
The following two patches drop the AXI speed when the processor is idle but the screen is on, down to the minimum that still allows the framebuffer to keep driving the screen. They claim to save ~20 mA, which would be somewhere around 10-20% at least. Unfortunately these will probably be harder to merge in, as the Chromium kernel uses quite a different mdp/framebuffer architecture. Still, worth trying!
https://www.codeaurora.org/gitweb/q...it;h=fdb01a9945f2ac3a4cc76c507ed0abf5dd5cfb57
Code:
msm_fb: Reduce AXI bus frequency to 62 MHz from 128 MHz
Reduced AXI bus frequency to 62 MHz to save power during idle
screen mode/limited sleep mode.
https://www.codeaurora.org/gitweb/q...it;h=dee586811e34cb24dedc1d0587f3e31f1ba656a6
Code:
msm_fb: Reduce AXI bus frequency to 58 MHz from 64 MHz.
Reduce AXI bus frequency further to 58 MHz for lcdc panels to save
~20mA power during idle screen mode/limited sleep mode.
I can't wait to see what Kmobz, Intersect, and pershoot do with this!
Interesting stuff... hopefully one of the many great kernel devs can use this info and find a way to make it useful on the N1 and save more battery life.
For those interested, if you have a look at pm.c, a lot of the logic seems to be already there:
Code:
/* axi 128 screen on, 61mhz screen off */
static void axi_early_suspend(struct early_suspend *handler) {
	axi_rate = 0;
	clk_set_rate(axi_clk, axi_rate);
}
The only thing that is missing is setting the AXI rates along with the CPU frequency. However, I would think it is going to take a fair amount of trial and error to get the rates right for the Nexus One, since they are, after all, different pieces of hardware.
We could try to further reduce the frequency of the AXI bus while the screen is off. AOSP currently does 61 MHz, right?
coolbho3000 said:
We could try to further reduce the frequency of the AXI bus while the screen is off. AOSP currently does 61 MHz, right?
It should be possible to go with Code Aurora's 58 MHz, I think?
Yes, we could reduce it further. I don't know what impact it will have on stability yet.
I have provided a very quick (and very dirty) patch for scaling the AXI rates according to CPU frequency. THIS IS NOT INTENDED TO BE USED UNLESS YOU KNOW WHAT YOU ARE DOING.
It is just a preliminary patch which compiles into a zImage. I do not know if it works (I'll leave that to the experts). Please, always (!!), check the patches and code before running the kernel image.
I tried to follow the conventions already existing in acpuclock-scorpion as much as possible. One caveat: I do not know what the return value of a successful clk_set_rate is, and I am only assuming based on the Code Aurora sources (pm.c does not use the return value).
Edit: forgot to add that I think it is a bit much for both pm.c AND acpuclock-scorpion to be setting the AXI rate (if it works at all), and it might cause stability issues. Maybe someone could try removing the pm.c code and using acpuclock only, I don't know...
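For what it's worth, the usual Linux clk API convention (which the Code Aurora tree appears to follow) is that clk_set_rate() returns 0 on success and a negative errno on failure. A sketch of checking it, with a hypothetical wrapper name:
Code:
#include <linux/clk.h>
#include <linux/kernel.h>

/* Sketch only: clk_set_rate() conventionally returns 0 on success and
 * a negative errno on failure; axi_clk/axi_rate as in pm.c, and
 * set_axi_checked() is a hypothetical name. */
static int set_axi_checked(struct clk *axi_clk, unsigned long axi_rate)
{
	int rc = clk_set_rate(axi_clk, axi_rate);

	if (rc < 0)
		pr_err("acpuclock: failed to set AXI rate %lu: %d\n",
		       axi_rate, rc);
	return rc;
}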
jazzor said:
Yes, we could reduce it further. I don't know what impact it will have on stability yet.
I have provided a very quick (and very dirty) patch for scaling the AXI rates according to CPU frequency. THIS IS NOT INTENDED TO BE USED UNLESS YOU KNOW WHAT YOU ARE DOING.
It is just a preliminary patch which compiles into a zImage. I do not know if it works (I'll leave that to the experts). Please, always (!!), check the patches and code before running the kernel image.
I tried to follow the conventions already existing in acpuclock-scorpion as much as possible. One caveat: I do not know what the return value of a successful clk_set_rate is, and I am only assuming based on the Code Aurora sources (pm.c does not use the return value).
Edit: forgot to add that I think it is a bit much for both pm.c AND acpuclock-scorpion to be setting the AXI rate (if it works at all), and it might cause stability issues. Maybe someone could try removing the pm.c code and using acpuclock only, I don't know...
I've made too many changes to acpuclock to apply this directly. I'll take a look at it later though.
jazzor said:
For those interested, if you have a look at pm.c, a lot of the logic seems to be already there:
Code:
/* axi 128 screen on, 61mhz screen off */
static void axi_early_suspend(struct early_suspend *handler) {
	axi_rate = 0;
	clk_set_rate(axi_clk, axi_rate);
}
True, but the comment doesn't agree with the code itself:
Code:
/* axi 128 screen on, 61mhz screen off */
<snip>
static void axi_late_resume(struct early_suspend *handler) {
	axi_rate = 128000000;
	sleep_axi_rate = 120000000; // <- 120 MHz...
	clk_set_rate(axi_clk, axi_rate);
}
I wonder if we could drop the sleep_axi_rate = 120000000 to something much lower?
Also, the OP patches were for dropping the AXI rate when the screen is on but the processor is idling. (And it's not clear what screen resolution the code in the OP patches was for, but if it's for ChromeOS I presume it's higher than our 800x480, so at our lower resolution bus bandwidth for the framebuffer shouldn't be a problem.) So no, unless I'm missing something, most of the logic in these patches isn't there already as you claim. (If I am missing something, please show me where... I believe it would have to be in the MDP driver somewhere.)
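A quick back-of-the-envelope check of why such low AXI rates can still feed the panel (assuming 16 bpp and a 60 Hz refresh for the N1, which are my assumptions):
Code:
800 x 480 pixels x 2 bytes/pixel x 60 Hz ≈ 46 MB/s of scanout traffic
Even a fairly narrow bus running at 58-62 MHz should comfortably cover that, which is presumably why Code Aurora could drop so low.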
hugonz said:
True, but the comment doesn't agree with the code itself:
Code:
/* axi 128 screen on, 61mhz screen off */
<snip>
static void axi_late_resume(struct early_suspend *handler) {
	axi_rate = 128000000;
	sleep_axi_rate = 120000000; // <- 120 MHz...
	clk_set_rate(axi_clk, axi_rate);
}
I wonder if we could drop the sleep_axi_rate = 120000000 to something much lower?
Also, the OP patches were for dropping the AXI rate when the screen is on but the processor is idling. (And it's not clear what screen resolution the code in the OP patches was for, but if it's for ChromeOS I presume it's higher than our 800x480, so at our lower resolution bus bandwidth for the framebuffer shouldn't be a problem.) So no, unless I'm missing something, most of the logic in these patches isn't there already as you claim. (If I am missing something, please show me where... I believe it would have to be in the MDP driver somewhere.)
I might be wrong, but this is how I understand it. axi_early_suspend is called when the Nexus is going into suspend, which is like suspend-to-RAM. This happens when there are no active tasks running and the screen is off. As the code suggests, the rate is set to 0, implying it will be set as low as the limits allow (assuming that is 61 MHz). axi_late_resume is for the case when the Nexus is NOT in suspend. There are still two states in this mode: one where the machine is being actively used (browsing the web), and one where the machine is idle (such as when we are listening to music or running any background task that may or may not be using the screen) but cannot suspend to RAM because tasks are still actively running.
In any case, experimenting with lower values of axi rates should be done to see if it indeed saves battery.
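For reference, early_suspend handlers in Android kernels of this era are registered roughly like this (a sketch; the level value is an assumption, and the two handlers are the pm.c functions quoted above):
Code:
#include <linux/earlysuspend.h>
#include <linux/init.h>

static void axi_early_suspend(struct early_suspend *handler);
static void axi_late_resume(struct early_suspend *handler);

/* Sketch: .suspend fires when the screen turns off and .resume when it
 * comes back on - independent of whether the device later reaches full
 * suspend-to-RAM. */
static struct early_suspend axi_power_suspend = {
	.level = EARLY_SUSPEND_LEVEL_DISABLE_FB,
	.suspend = axi_early_suspend,
	.resume = axi_late_resume,
};

static int __init axi_power_init(void)
{
	register_early_suspend(&axi_power_suspend);
	return 0;
}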
oooh great thread. I will definitely be looking into this stuff when I get back in town! thanks for the links!
https://www.codeaurora.org/gitweb/q...it;h=8a68421a300878d729991c1db609a541910d2c70
Here is another patch (more up to date, so perhaps easier to merge).
I just implemented the AXI frequency tweak for the N1 - I noticed it is also present in the HTC Desire kernel.
I am testing it now - so far it seems stable.
Attached acpuclock_scorpion.c implementing the AXI tweaks. Basically, AXI frequency now scales with CPU frequency.
Ivan Dimkovic said:
I just implemented the AXI frequency tweak for the N1 - I noticed it is also present in the HTC Desire kernel.
I am testing it now - so far it seems stable.
Attached acpuclock_scorpion.c implementing the AXI tweaks. Basically, AXI frequency now scales with CPU frequency.
Do you run this along with a custom kernel and just flash in recovery like an update.zip?
I usually use Cyanogen's kernel as a base for my experiments - so this should definitely compile with Cyanogen's 2.6.34 kernels. This acpuclk-scorpion.c has lowered voltages and also supports frequencies up to 1190 MHz.
You will also need to change a few call prototypes in acpuclock.h - it is quite trivial: just replace the old acpuclk_set_rate() and acpuclk_power_collapse() prototypes with these:
Code:
enum setrate_reason {
	SETRATE_CPUFREQ = 0,
	SETRATE_SWFI,
	SETRATE_PC,
	SETRATE_PC_IDLE,
};

int acpuclk_set_rate(unsigned long rate, enum setrate_reason reason);
unsigned long acpuclk_power_collapse(int from_idle);
That should do it.
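To illustrate the new signature, callers now have to say why the rate is changing. For example (the rate value is illustrative, and whether the tree expects Hz or kHz here is something to check in acpuclock-scorpion.c - this example assumes kHz):
Code:
/* Illustrative call only; 998400 kHz is an example rate. */
acpuclk_set_rate(998400, SETRATE_CPUFREQ);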
Markdental said:
Do you run this along with a custom kernel and just flash in recovery like an update.zip?
No, this will have to be compiled into the kernel, which I'm doing right now. Thank you so much for doing this, Ivan!
Nice to see someone finally got this to work! I've been too lax as of late.
Getting AXI scaling to work was pretty easy - makes me wonder why it is not in the AOSP kernel for the N1.
It is definitely in the HTC Desire kernel, and I am sure HTC knows what is good for the hardware they built.
Ivan Dimkovic said:
Getting AXI scaling to work was pretty easy - makes me wonder why it is not in the AOSP kernel for the N1.
It is definitely in the HTC Desire kernel, and I am sure HTC knows what is good for the hardware they built.
What change to pm.c, if any?
pm.c already has everything in place needed for AXI clock switching.
Does anyone know if the computer discussed here http://www.kickstarter.com/projects...a-supercomputer-for-everyone?ref=home_popular would be able to compile Android if it were running Linux?
You would need to get all the tools for the build system running on ARM. I'm pretty sure most of it has been done (gcc, python, bash), because there is an Ubuntu build for ARM CPUs. The specs on that thing even say it will come with Ubuntu on it. I'm not sure if the JDK has been ported to ARM yet.
I think you're going to hit a wall with 1 GB of RAM easily. The operating system you're using will probably take up 1/4 to 1/3 of it. Go around and look at the requirements to build projects like Firefox and OpenOffice; last time I saw it, Firefox needed something like 3 GB of RAM for the linker. You can get a huge SD card and use it as swap space, but that's going to slow down all those 64 cores. Next up is the disk interface. It has USB 2.0, which is capped at 480 Mb/s (~60 MB/s). It doesn't benefit you at all that your CPU can build a bunch of source files at once if it gets bottlenecked reading those source files from and writing the object files to the hard drive.
I'd say you probably will be able to get it to build Android, but it won't be lightning fast, or even remarkably fast. By the time you buy that thing for $99, plus a keyboard, mouse, USB HDD, SD card, HDMI monitor, and whatever else you need to actually use it, you could have bought a "traditional" computer with SATA and more than 1 GB of RAM.
noneabove said:
Does anyone know if the computer discussed here http://www.kickstarter.com/projects...a-supercomputer-for-everyone?ref=home_popular would be able to compile Android if it were running Linux?
No, it will not.
Compiling isn't a task suited to such a parallel computer. Compiling is mostly I/O-intensive, not CPU-intensive, so you would not gain much here, even if you were able to distribute the compilation across multiple cores - which is itself not a trivial task if we are talking about more than a handful of cores.
Also, you don't need a project like this to run a parallel supercomputer. You can run parallel workloads on modern graphics cards today. E.g. get an NVIDIA GPU and start using CUDA, and you'll get the idea of what it's all about.
Parallel supercomputing is more suitable for specific CPU-intensive tasks such as FFTs, flow analysis, brute-forcing crypto, neural nets and the like, where you've got a relatively limited amount of data compared to the amount of CPU needed.
As has been said, there's more return (financial and performance) and less work involved in implementing with CUDA.
An example of the outrageous performance of a CUDA system: with password-cracking software, a Core i5 managed 125,000 operations/s... after enabling CUDA support in the software, it became more than 8 million/s.
As the title says:
What is the best compression level for APKs?
And what are its advantages and disadvantages?
Good question... I wonder too.
+1
I would love to know this as well, as I'm attempting to optimize one of my ROMs. From what I understand, it's all based on how many cores you have and the quality of the CPU. In compressing the APK you're optimizing it in one way but adding strain to the CPU, because it now has the added task of decompressing every time; but if you've got something like a quad-core Snapdragon 805, it's a pretty easy task to handle by spreading the workload. Really it's mostly beneficial for devices with low-end flash memory; the key is to have a good-quality CPU with multiple cores to ease the workload. In a nutshell, you're taking weight off the storage and adding it to the CPU - good for devices like the Moto G with sub-par flash memory but a decent CPU. There's obviously no single compression level that works great for all devices; it's mostly a trial-and-error situation where you need to find the right shoe that fits your specific device. A small experiment along these lines is sketched below.
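For anyone wanting to experiment: APK entries are ordinary DEFLATE-compressed zip data, so zlib's level parameter (1-9) is the knob in question. A minimal sketch for comparing levels on a dummy buffer (illustrative only - and note that DEFLATE decompression speed is roughly level-independent, so higher levels mostly cost CPU when the APK is built, not on the device):
Code:
#include <stdio.h>
#include <string.h>
#include <zlib.h>

/* Compare zlib compression levels on a dummy buffer. Higher levels
 * shrink the output at the cost of more compression CPU time. */
int main(void)
{
	static unsigned char in[1 << 16], out[1 << 17];
	int level;

	memset(in, 'A', sizeof(in));
	for (level = 1; level <= 9; level += 4) {
		uLongf out_len = sizeof(out);

		if (compress2(out, &out_len, in, sizeof(in), level) != Z_OK) {
			fprintf(stderr, "compress2 failed at level %d\n", level);
			return 1;
		}
		printf("level %d -> %lu bytes\n", level, (unsigned long)out_len);
	}
	return 0;
}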
Hello,
I just got my Z5C yesterday and so far I'm more than happy. But there is one issue:
I use the AOSP full-disk encryption on the phone, but it seems like the native Qualcomm hardware cryptographic engine doesn't work well - I benchmarked the internal storage before and after; here are the results:
Before: read ~200 MB/s, write ~120 MB/s
After: read ~120 MB/s, write ~120 MB/s
(Benchmarked with A1 SD Bench)
I'm using FDE on my Windows 10 notebook with an eDrive, resulting in something like a 5% performance loss. The decrease in read speed on the Z5C is noticeable. What do you think - is there something wrong, or is this normal behaviour?
Cheers
I don't know if this helps, but it seems that the Nexus 5X and 6P won't use hardware encryption according to this:
DB> Encryption is software accelerated. Specifically the ARMv8 as part of 64-bit support has a number of instructions that provides better performance than the AES hardware options on the SoC.
Source: The Nexus 5X And 6P Have Software-Accelerated Encryption, But The Nexus Team Says It's Better Than Hardware Encryption
So maybe Sony is following the same path...
Sadly they don't; it seems like the speed decrease is on the same level as the N6 back then. Let's hope they include the libs in the kernel with the Marshmallow update.
Why would they use Qualcomm's own crappy crypto engine if the standard Cortex-A57 is really fast with AES, thanks to NEON and the ARMv8 crypto extensions? AFAIK the latter are supported by default in newer Linux kernels, so there's no need for additional libraries to enable support, or for the Qualcomm crypto stuff.
But it would be nice if someone with actual insight and detailed knowledge about this could say a few words of clarification. The relevant kernel options are sketched below.
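On mainline arm64 kernels, the ARMv8 crypto extensions are exposed through config options along these lines (option names as in mainline; whether Sony's kernel enables them, and whether dm-crypt actually picks them up, is exactly the open question):
Code:
CONFIG_ARM64_CRYPTO=y
CONFIG_CRYPTO_AES_ARM64_CE=y      # AES via the AESE/AESD instructions
CONFIG_CRYPTO_AES_ARM64_CE_BLK=y  # CBC/XTS block modes used by dm-crypt
CONFIG_CRYPTO_SHA1_ARM64_CE=y
CONFIG_CRYPTO_SHA2_ARM64_CE=y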
I have neither insight nor deep knowledge, but I benchmarked the system, and a ~40% loss in read speed (200 MB/s down to 120 MB/s) doesn't feel like an optimized kernel either :/
Qualcomm is a no-go. On the Android platform, only the Exynos 7420 (not sure about the 5xxx series) really gets to use its h/w encryption/decryption engine with no slowdown.
TheEndHK said:
Qualcomm is a no-go. On the Android platform, only the Exynos 7420 (not sure about the 5xxx series) really gets to use its h/w encryption/decryption engine with no slowdown.
That's not only off-topic, it's also wrong. The Exynos SoCs don't have a substantially different crypto engine or "better"/"faster" crypto/hashing acceleration via the ARM cores. If anything, the Samsung guys are smart enough to optimize their software so that it makes use of the good hardware. That seems to be missing here, for no obvious reason.
xaps said:
That's not only off-topic, it's also wrong. The Exynos SoCs don't have a substantially different crypto engine or "better"/"faster" crypto/hashing acceleration via the ARM cores. If anything, the Samsung guys are smart enough to optimize their software so that it makes use of the good hardware. That seems to be missing here, for no obvious reason.
I agree that all ARMv8-A CPUs support hardware AES and SHA. Both the Exynos 7420 and the S810 should have that ability, but it turns out it doesn't work on the Z5C right now, which is a fact. I'm sure the S6 has it working, but I'm not sure about other S810 phones; it might be that Qualcomm is missing driver support.
TheEndHK said:
Both the Exynos 7420 and the S810 should have that ability, but it turns out it doesn't work on the Z5C right now, which is a fact.
Please show us the kernel source code proving that fact.
What you call "fact" is the result of a simple before and after comparison done with a flash memory benchmark app run by one person on one device. To draw the conclusion that the only reason for the shown result is that the Z5(c) can't do HW acceleration of AES or SHA is a bit far-fetched, don't you think?
xaps said:
Please show us the kernel source code proving that fact.
What you call "fact" is the result of a simple before and after comparison done with a flash memory benchmark app run by one person on one device. To draw the conclusion that the only reason for the shown result is that the Z5(c) can't do HW acceleration of AES or SHA is a bit far-fetched, don't you think?
I've got an S6 and it's no slower after encryption, and we had a thread discussing this on the S6 board.
I don't own a Z5C right now because HK, where I live, hasn't started selling it yet (I came here because I'm considering selling my S6 and Z1C and swapping to a Z5C later), so I can't test it, but according to the OP there is a substantial slowdown.
All ARMv8-A CPUs should support hardware AES/SHA; it's not just a cached benchmark result on the S6. That's real.
A few things to ponder...
This is confusing. I was always under the impression that decryption (reads) is usually a tad faster than encryption (writes). That at least seems true for TrueCrypt benchmarks, but it may be comparing apples and oranges.
A few thoughts...
In some other thread it was mentioned that the Z5C optimizes RAM usage by doing on-the-fly compression/decompression to make very efficient use of the RAM. As ciphertext is usually incompressible, could this be a source of the slowdown on flash R/W (either as an actual slowdown, or by confusing the benchmarking tool's measurements)?
These days SSD flash controllers also transparently compress data before writing it, to reduce wear on the flash. If you send a huge ASCII plaintext file into the write queue, the write speed will be ridiculously high; if you send incompressible data like video, the write rate goes way down. This happens at the hardware level, without any encryption/decryption operations at the OS level being involved.
Is there a similar function in today's smartphone flash controllers?
Can I ask the OP: in what situations do you notice the slower read rate on the encrypted device? Not so long ago, when spinning-rust disks were still the norm in desktop and laptop computers, read rates of 120 MB/s were totally out of reach. What kind of usage on your smartphone makes you actually notice the lag? Is it when loading huge games or PDF files or something similar?
https://www.xda-developers.com/how-to-double-the-wifi-speed-on-your-oneplus-3-3t/
"According to XDA Senior Member dreinulldrei, the WiFi configuration file used in the OnePlus 3 and 3T is the default one provided by Qualcomm. That’s not really an issue in and of itself, but the user discovered that the default configuration disabled channel bonding in the 2.4GHz frequency. 5GHz frequency networks have channel bonding enabled (and if you have access to this frequency, then it is advised you connect to it), but if your router only support the 2.4GHz frequency then this trick may be useful for you.
Enabling channel bonding should theoretically double your wireless throughput (as long as your router supports channel bonding), as the channel width increases from 20MHz to 40MHz. This trick is quite easy to implement, as all one needs to do is modify one line in WCNSS_qcom_cfg.ini (located in /system/etc/wifi)."
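For reference, in the stock prima/WCNSS configuration the key in question appears to be the one below - treat the exact name as an assumption for your build, and check the file before editing:
Code:
# /system/etc/wifi/WCNSS_qcom_cfg.ini
# 0 = 20 MHz channels only, 1 = allow 40 MHz channel bonding on 2.4 GHz
gChannelBondingMode24GHz=1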
I just saw this last night and was going to play with it this morning. Excellent.
Also, I'm glad I'm not the only one that's realized that we can kang a bunch of things from the OP3(+) forums as well
FYI, there is a Magisk module that does this as well, if you don't like manually editing files.
Just say no to 40 MHz channels on 2.4 GHz. It was a mistake to add it to the standard (which is the viewpoint of most people actually doing WiFi work - you won't find many supporters in the OpenWrt/LEDE camps). The ONLY way it's good is if you live somewhere with basically no one else on the channels. If there are other WiFi networks around on 2.4 GHz, all it does is raise the noise floor for everyone, which lowers everyone's potential throughput and makes all devices use more power. Plus, if the router detects even one non-40 MHz-compatible device, the standard says it's supposed to drop back to 20 MHz operation only, even though it's still outputting the extra RF. So then you only have costs without any benefits. It's not worth it - just get a 5 GHz router.
Makes things worse for me, somehow: 70-90 Mb/s with it set to 0, 40-70 Mb/s with it set to 1. 250-320 Mb/s on the 5 GHz channel...
Thank you for sharing this information with us.
I was reading through the file and noticed the line
Code:
gTxPowerCap=30
Does it mean that by entering a higher number we can improve the WiFi signal? I tried 50 and it seems like nothing happened.
Maybe someone knows what it does.