Congratulations to the winner - Ref. no.: Sp / 229/0 na 1.7 / 5.2 / EÚ.

2014-08-27 Thread El
Congratulations to the winner - Ref. no.: Sp / 229/0 na 1.7 / 5.2 / EÚ.

Your e-mail ID has just won €450,000.00 (four hundred and fifty thousand euros)
in the International Uplift charity program. Ref. no. Sp / 229/0 na 1.7 / 5.2 / EÚ.
Lucky No. 9/11/13/24/40.

For more information and the claim procedure, contact:
STALLION MEGA NÁROK AGENCY
Mr. Juan Carlos.
E-mail: infosta...@aol.com
Tel: + 34-632 662 036 (speaks English and Spanish)

Full name, address, age, occupation, telephone numbers

Send your reply to this e-mail: infosta...@aol.com
Congratulations.
Best regards,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


9/11/13/24/43 and

2014-08-19 Thread El

FINAL NOTICE
YOUR EMAIL ID has won (450,000.00 EUR) in the Spanish "El Gordo"
International E-mail Lottery award, with the lucky numbers 9/11/13/24/43 and
Ref: ES/9420X2/68.
For clarification and processing, contact:
STALLION MEGA NÁROK AGENCY
Mr. Juan Carlos

E-mail: infosta...@aol.com
Tel: 0034-632 662 036 (speaks English and Spanish)
With your full names, address, age, occupation, telephone numbers,
send your reply to this e-mail: infosta...@aol.com
Congratulations.
Best regards,



Re: The 10ms averager in fair.c + granularity

2012-10-04 Thread el es
Hello, Uwaysi,

Uwaysi Bin Kareem  paradoxuncreated.com> writes:

> 
> Ok at 100hz, granularity seems to work as expected. Actually 1000hz for  
> desktop seems to be a myth. I have less jitter with 100hz. Very nice. I  
> think jitter is 99.99% eliminated from doom 3 now.
> 
> Peace Be With You!
> 

I think for some real credibility you'd have to come up with a
(synthetic) benchmark that clearly demonstrates this problem...

/This/ seems to be the /real/ problem here: how do you measure generic
jitter? In hard figures, not just 'theoretically'.

And no, the subjective 'I can feel it' doesn't work, as everybody
has a different perception of 'jitter', especially in OpenGL.

Maybe try to match the 'filter' averaging period to your graphics refresh
period (about 16.7 ms for 60 Hz, you get the idea?) and tell us how that works?
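
One way to get such hard figures, sketched here as an illustration and not something proposed in the thread: wake up at a fixed period, record the actual wakeup times, and summarise how far they drift from the ideal schedule. Only the summarising step is shown; the sampling loop (for example around `clock_nanosleep`) is left out, and the names `frame_period_ns`, `jitter_stats` and `jitter_summarise` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Period of one display refresh in nanoseconds (integer division). */
int64_t frame_period_ns(int refresh_hz)
{
    assert(refresh_hz > 0);
    return INT64_C(1000000000) / refresh_hz;
}

struct jitter_stats {
    int64_t max_abs_ns;   /* worst deviation from the ideal wakeup time */
    int64_t mean_abs_ns;  /* average absolute deviation */
    size_t  missed;       /* wakeups late by a full period or more */
};

/*
 * wake_ns[i] is the measured time of wakeup i (monotonic clock, ns).
 * The ideal time of wakeup i is wake_ns[0] + i * period_ns.
 */
struct jitter_stats jitter_summarise(const int64_t *wake_ns, size_t n,
                                     int64_t period_ns)
{
    struct jitter_stats s = {0, 0, 0};
    int64_t sum = 0;

    assert(n > 0 && period_ns > 0);
    for (size_t i = 0; i < n; i++) {
        int64_t ideal = wake_ns[0] + (int64_t)i * period_ns;
        int64_t dev = wake_ns[i] - ideal;
        int64_t abs_dev = dev < 0 ? -dev : dev;

        if (abs_dev > s.max_abs_ns)
            s.max_abs_ns = abs_dev;
        if (dev >= period_ns)
            s.missed++;
        sum += abs_dev;
    }
    s.mean_abs_ns = sum / (int64_t)n;
    return s;
}
```

Running the same summary at HZ=100 and HZ=1000 under an identical load would give numbers that can be compared, instead of an impression.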

Lukasz





The uncatchable jitter, or may the scheduler wars be over?

2012-10-05 Thread el es
Hello,

first of all, the posts that inspired me to write this up
were from Uwaysi Bin Kareem (paradoxuncreated dot com).

Here is what I think:
could the source of graphics/video jitter, as most people
perceive it, be something that could technically be defined
as a 'graphics buffer underrun'? It would be caused by the scheduler
being unable to align its deadlines for the userspace programs
that are crucial to video/OpenGL output with the v-refresh, which
is really HARD RT. Say the scheduler sometimes decides to preempt
userspace in the middle of an OpenGL/fb call. This is pretty easy
to imagine: a userspace program that often blocks on calls to the
video hardware (or has a userspace thread that does) is unable to
finish some OpenGL pipeline calls before the end of its slice. Or,
in case of misalignment, it executes enough commands to create one
(or several) frame(s), is then cut off in the middle of creating
another one and has to wait for its turn again. In the meantime,
the vsync/video buffer swap occurs, and that last frame is
lost, discarded, or created with time settings from the previous
slice, which are wrong.

Bearing in mind that the video/fb/OpenGL buffer queue is
probably at most 3 frames deep (triple buffering, as in
some game settings), as opposed to (at least some)
sound hw, which can have buffers several ms long,
it's not hard to imagine what happens if userspace cannot
make it in time to update the buffer: 'underruns'.
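
The underrun idea can be written down as a toy model; this is a sketch under stated assumptions, not a measurement of any real system. It assumes a vsync every `period_ns`, starting at `period_ns`, and a sorted list of times at which the renderer finished each frame. At every vsync the oldest finished but not yet shown frame is flipped to the screen; if there is none, the old frame is repeated and that counts as an underrun. The function name `count_underruns` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * ready_ns[] holds frame completion times in ascending order.
 * Returns how many of the first n_vsyncs vsyncs had no new frame to show.
 */
size_t count_underruns(const int64_t *ready_ns, size_t n_frames,
                       int64_t period_ns, size_t n_vsyncs)
{
    size_t next = 0;       /* index of the next frame not yet shown */
    size_t underruns = 0;

    assert(period_ns > 0);
    for (size_t k = 1; k <= n_vsyncs; k++) {
        int64_t vsync = (int64_t)k * period_ns;

        if (next < n_frames && ready_ns[next] <= vsync)
            next++;        /* a finished frame is flipped to the screen */
        else
            underruns++;   /* nothing new: the previous frame is repeated */
    }
    return underruns;
}
```

Feeding it completion times recorded from a real render loop would turn 'I can feel it' into a count of repeated frames per second.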

This would also explain why it doesn't matter to 'server'
people: they don't have RT video hw/buffers to care for...
(but they tune the settings below for max throughput instead)

But whether it is measurable or not, I don't know.

The OP (Uwaysi) has been fiddling with the HZ value and the
averaging period of the scheduler (which he called 'filter'),
and with granularity too. He's had some interesting results IMO.

Hope the above makes sense and is not too much gibberish :)

Lukasz



Re: Stupid user with user-space questions, matrix LED driving with user space code only.

2013-02-18 Thread el es
Jonathan Andrews  jonshouse.co.uk> writes:


> 
> What about a yield alignment mechanism for user space? I.e. the process
> calls the kernel with a request "schedule me first after a yield" - then
> the process at least has whatever the timer granularity is to do
> something timing critical... add a flag to ignore or defer interrupts
> and you have a semi 'hard-realtime' behaviour for user space, allowing
> user space to grab small chunks of real time. Yes, a nasty-looking
> facility for SMP Intel servers but really useful for embedded.
> 

Seems you have some (bad?) habits from embedded programming, 
you think Linux is FreeRTOS ;)

Linux as such, as far as I read, is not a real-time OS, it will 
NOT do what you want in userspace, (maybe unless you build it with
the RT patchset?)

Better take the advice and go build a kernel driver for this display.

Or use a small microcontroller that won't have the limitations. 
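
For context, the closest thing stock Linux offers a userspace process is the POSIX real-time scheduling classes. A minimal sketch follows, assuming a Linux/glibc system and sufficient privileges; `request_fifo_priority` is a hypothetical helper name. This gives the process priority over normal tasks and keeps its memory resident, but it does not defer interrupts and gives no hard timing guarantee, which is the point made above.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Ask for SCHED_FIFO at the given priority and lock memory.
 * Returns 0 on success or a negative errno value
 * (typically -EPERM when run without the needed privileges).
 */
int request_fifo_priority(int priority)
{
    struct sched_param sp;
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo < 0 || hi < 0)
        return -errno;
    assert(lo <= hi);
    if (priority < lo || priority > hi)
        return -EINVAL;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = priority;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -errno;

    /* Avoid page faults in the timing-critical path. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -errno;
    return 0;
}
```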

> Thanks,
> Jon
> 
> 
Lukasz





pcmcia-cd-3.1.19

2000-08-30 Thread Ibrahim El-Shafei

Hi,
When I tried to install pcmcia-cs-3.1.19, I got a message, which I have
attached to this e-mail, so I stopped the installation until I find the
answer.

This may help you:
I live in Egypt/CAIRO

thanks for help.

_/\_/\_
  / 0 ! O \
0| <___> |0
  \___/

 pcmcia-error



Re: pcmcia-cd-3.1.19

2000-08-31 Thread Ibrahim El-Shafei

Thanks for your reply.

> On Thu, 31 Aug 2000, Ibrahim El-Shafei wrote:
>
> > Hi,
> > When I tried to install the pcmcia-cs-3.1.19, I got a message that I
> > attached it with this E-Mail, so I stopped the installation until I find
> > the answer.
>
> Date doesn't like that no timezone is set. Try setting one.
>
> man date might be helpful.


But the time and date are set correctly, and I don't know why I get this
message. Can I ignore the message?

>
>
> > This may help you:
> > I live in Egypt/CAIRO
> >
> > thanks for help.
>
>
> Igmar
>



Re: 2.4.0test7 panics on boot

2000-09-02 Thread Ahmed El-Mahmoudy

On Fri, 1 Sep 2000, Stephen Lee wrote:

> Felix von Leitner <[EMAIL PROTECTED]> wrote:
> 
> > The last non-panic message on screen is:
> > 
> >   IPv6 v0.8 for NET4.0
> > 
> > The panic reason is "attempting to kill init".
> > Has anyone else had this problem?
> 
> I have the same problem if I have ipv6 compiled in.  Sorry, I'll
> post the oops messages when I get my hands on it.
> 

yeah, that is fixed in test8-pre1

> Stephen
> 
> 

-- 

 Ahmed El-Mahmoudy

  E-mail reply address : [EMAIL PROTECTED]
  Web : http://members.muslimsites.com/aelmahmoudy/
  PGP signature: http://members.muslimsites.com/aelmahmoudy/mypgp.key
  Snail mail : P.O. Box 10 Saray Elkobba,
   Cairo ,Egypt.
   Postal Code 11712




Installing kernel-2.4.test6

2000-09-08 Thread Ibrahim El-Shafei

Hi all,
I got this error when I tried to run 'make bzImage' or 'make install', etc.
The error output and the Makefile are attached to this message.

Thanks for your help.

Yours,
Ibrahim El-Shafei

_/\_/\_
  / 0 ! O \
0| <___> |0
  \___/

 Makefile

gcc -D__KERNEL__ -I/tmp/linux/include -Wall -Wstrict-prototypes -O2 
-fomit-frame-pointer -pipe -fno-strength-reduce  -c -o init/main.o init/main.c
{standard input}: Assembler messages:
{standard input}:357: Error: no such 386 instruction: `ldmxcsr'
{standard input}:363: Error: no such 386 instruction: `movups'
{standard input}:364: Error: no such 386 instruction: `movups'
{standard input}:365: Error: no such 386 instruction: `divps'
{standard input}:375: Error: no such 386 instruction: `ldmxcsr'
make: *** [init/main.o] Error 1




Installing Kernel-2.4.test6 problem

2000-09-10 Thread Ibrahim El-Shafei

Hi all,
I got this error when I tried to run 'make bzImage' or 'make install', etc.
The error output and the Makefile are attached to this message.

Thanks for your help.

Yours,
Ibrahim El-Shafei

_/\_/\_
  / 0 ! O \
0| <___> |0
  \___/

 Makefile

emd.c: In function `umsdos_emd_dir_readentry':
emd.c:145: invalid operands to binary -
emd.c: In function `umsdos_writeentry':
emd.c:264: invalid operands to binary -
emd.c:264: invalid operands to binary -
emd.c:264: invalid operands to binary -
make[3]: *** [emd.o] Error 1
make[3]: Leaving directory `/tmp/linux/fs/umsdos'
make[2]: *** [first_rule] Error 2
make[2]: Leaving directory `/tmp/linux/fs/umsdos'
make[1]: *** [_subdir_umsdos] Error 2
make[1]: Leaving directory `/tmp/linux/fs'
make: *** [_dir_fs] Error 2




Re: xfs internal error on a new filesystem

2007-02-15 Thread Ahmed El Zein


David Chinner <[EMAIL PROTECTED]> wrote on 15 Feb 2007, 11:16 AM:
Subject: Re: xfs internal error on a new filesystem
>On Wed, Feb 14, 2007 at 10:24:27AM +, Ramy M. Hassan  wrote:
>> Hello,
>> We got the following xfs internal error on one of our production servers:
>> 
>> Feb 14 08:28:52 info6 kernel: [238186.676483] Filesystem "sdd8": XFS
>> internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c. 
>> Caller 0xf8b906e7
>
>Real stack looks to be:
>
> xfs_trans_cancel
> xfs_mkdir
> xfs_vn_mknod
> xfs_vn_mkdir
> vfs_mkdir
> sys_mkdirat
> sys_mkdir
>
>We aborted a transaction for some reason. We got an error somewhere in
>a mkdir while we had a dirty transaction.  Unfortunately, this tells us
>very little about the error that actually caused the shutdown.
>
>What is your filesystem layout? (xfs_info ) How much memory
>do you have and were you near ENOMEM conditions?

We have 1536 MB of RAM. It is possible that at the time of the crash we
were near ENOMEM conditions. I don't know for sure, but we have seen such
spikes on our servers.

[EMAIL PROTECTED]:~# xfs_info /vol/6/
meta-data=/dev/sdd8              isize=256    agcount=16, agsize=7001584 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=112025248, imaxpct=25
         =                       sunit=16     swidth=64 blks, unwritten=0
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0


>
>> We were able to unmount/remount the volume (didn't do xfs_repair because
>> we thought it might take a long time, and the server was already in
>> production at the moment)
>
>Risky to run a production system on a filesystem that might be corrupted.
>You risk further problems if you don't run repair.
>
>> The file system was created less than 48 hours ago, and 370G of sensitive
>> production data was moved to the server before the xfs crash.
>
>So that's not a "new" filesystem at all...
By new we meant 48 hours old.

>
>FWIW, did you do any offline testing before you put it into production?

We did some basic testing. But as a filesystem developer, how would you
test a filesystem so that you would be comfortable with the stability of 
the filesystem and be worry free in terms of faulty hardware? 

>
>> System details :
>> Kernel: 2.6.18
>> Controller: 3ware 9550SX-8LP (RAID 10)
>
>Can you describe your dm/md volume layout?

One unit, 8 HDDs, a stripe of 4 mirrors.
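
The geometry reported by xfs_info earlier in this message can be cross-checked against this layout. A small sketch; the struct and function names are made up for illustration, and the numbers fed in are the ones quoted above (bsize=4096, blocks=112025248, sunit=16, swidth=64).

```c
#include <assert.h>
#include <stdint.h>

struct xfs_geom {
    uint64_t bsize;    /* filesystem block size in bytes */
    uint64_t blocks;   /* number of data blocks */
    uint64_t sunit;    /* stripe unit, in filesystem blocks */
    uint64_t swidth;   /* stripe width, in filesystem blocks */
};

/* Size of the data section in bytes. */
uint64_t xfs_data_bytes(const struct xfs_geom *g)
{
    return g->bsize * g->blocks;
}

/* Stripe unit in bytes (the chunk written to one stripe member). */
uint64_t xfs_stripe_unit_bytes(const struct xfs_geom *g)
{
    return g->sunit * g->bsize;
}

/* Number of data-bearing stripe members implied by swidth / sunit. */
uint64_t xfs_data_disks(const struct xfs_geom *g)
{
    assert(g->sunit > 0);
    return g->swidth / g->sunit;
}
```

With the quoted values this gives 458,855,415,808 bytes (about 427 GiB), a 64 KiB stripe unit, and 4 data-bearing stripe members, which is consistent with a stripe of 4 mirrors.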

>
>> We are wondering here if this problem is an indicator of data corruption
>> on disk?
>
>It might be. You didn't run xfs_check or xfs_repair, so we don't know if
>there is any on disk corruption here.
>
>> is it really necessary to run xfs_repair ?
>
>If you want to know if you haven't left any landmines around for the
>filesystem to trip over again. i.e. You should run repair after any
>sort of XFS shutdown to make sure nothing is corrupted on disk.
>If nothing is corrupted on disk, then we are looking at an in-memory
>problem.
We will run repair tonight.

>
>> Do you recommend that we switch back to reiserfs?
>
>Not yet.
>
>> Could it be a hardware-related problem?
>
>Yes. Do you have ECC memory on your server? Have you run memtest86?
>Were there any I/O errors in the log prior to the shutdown message?
Yes, we have ECC memory.
We will try to run memtest86 as soon as possible.
There were no I/O errors in the log prior to the shutdown message.

Btw, this is a vmware image. /vol/6 is an exported physical partition.

>Cheers,
>
>Dave.
>-- 
>Dave Chinner
>Principal Engineer
>SGI Australian Software Group
>



Re: [PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-07 Thread Abdellatif El Khlifi
Hi Mathieu,

> > +   do {
> > +   state_reg = readl(priv->reset_cfg.state_reg);
> > +   *rst_ack = EXTSYS_RST_ST_RST_ACK(state_reg);
> > +
> > +   if (*rst_ack == EXTSYS_RST_ACK_RESERVED) {
> > +   dev_err(dev, "unexpected RST_ACK value: 0x%x\n",
> > +   *rst_ack);
> > +   return -EINVAL;
> > +   }
> > +
> > +   /* expected ACK value read */
> > +   if ((*rst_ack & exp_ack) || (*rst_ack == exp_ack))
> 
> I'm not sure why the second condition in this if() statement is needed.  As
> far as I can tell the first condition will trigger and the second one won't
> be reached.

The second condition takes care of the following case: exp_ack and *rst_ack
are both 0.
This case happens when the RST_REQ bit is cleared (meaning: no reset requested)
and we expect RST_ACK to be 00 afterwards.

> > +/**
> > + * arm_rproc_load() - Load firmware to memory function for rproc_ops
> > + * @rproc: pointer to the remote processor object
> > + * @fw: pointer to the firmware
> > + *
> > + * Does nothing currently.
> > + *
> > + * Return:
> > + *
> > + * 0 for success.
> > + */
> > +static int arm_rproc_load(struct rproc *rproc, const struct firmware *fw)
> > +{
> 
> What is the point of doing rproc_of_parse_firmware() if the firmware image is
> not loaded to memory?  Does the remote processor have some kind of default ROM
> image to run if it doesn't find anything in memory?

Yes, the remote processor has a default FW image already loaded.

rproc_boot() [1] and _request_firmware() [2] fail if there is no FW file in the
filesystem or no filename is provided.

Please correct me if I'm wrong.

[1]: 
https://elixir.bootlin.com/linux/v6.8-rc7/source/drivers/remoteproc/remoteproc_core.c#L1947
[2]: 
https://elixir.bootlin.com/linux/v6.8-rc7/source/drivers/base/firmware_loader/main.c#L863

> > +module_platform_driver(arm_rproc_driver);
> > +
> 
> I am echoing Krzysztof's view about how generic this driver name is.  This
> has to be related to a family of processors or be made less generic in some
> way.  Have a look at what TI did for their K3 lineup [1] - I would like to
> see the same thing done here.

Thank you, I'll take care of that and of all the other comments made.

Cheers,
Abdellatif



Re: [PATCH 3/3] dt-bindings: remoteproc: Add Arm remoteproc

2024-03-08 Thread Abdellatif El Khlifi
Hi Krzysztof, Sudeep,

> > > diff --git a/Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml 
> > > b/Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml
> > > new file mode 100644
> > > index ..322197158059
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/remoteproc/arm,rproc.yaml
> > > @@ -0,0 +1,69 @@
> > > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> > > +%YAML 1.2
> > > +---
> > > +$id: http://devicetree.org/schemas/remoteproc/arm,rproc.yaml#
> > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > +
> > > +title: Arm Remoteproc Devices
> > 
> > That's quite generic... does it apply to all ARM designs?
> > 
> 
> Nope, it is platform specific. It can't just generically be referred as
> Arm Remoteproc for sure.

Thank you guys.

The file names and the documentation will reflect that it's
an Arm Corstone SoC. Work in progress.

Cheers,
Abdellatif



Re: [PATCH 2/3] arm64: dts: Add corstone1000 external system device node

2024-03-08 Thread Abdellatif El Khlifi
Hi Sudeep,

> > +   extsys0: remoteproc@1a010310 {
> > +   compatible = "arm,corstone1000-extsys";
> > +   reg = <0x1a010310 0x4>,
> > +   <0x1a010314 0X4>;
> 
> 
> As per [1], this is just a few registers within the 64kB block.
> Not sure if it should be represented as a whole or as just a couple
> of registers like this for reset.
> 
> [1] 
> https://developer.arm.com/documentation/101418/0100/Programmers-model/Register-descriptions/Host-Base-System-Control-register-summary

The Host Base System Control registers are not specific to the External System
processors. They are various registers with different purposes.

Only 4 registers matter for the remoteproc feature:

- The External System 0 reset control and status registers:
  EXT_SYS0_RST_CTRL, EXT_SYS0_RST_ST
- The same for External System 1: EXT_SYS1_RST_CTRL, EXT_SYS1_RST_ST

So, mapping the whole Host Base System Control area doesn't make sense for the
remoteproc feature, and it would expose registers that are not related to the
External Systems to the driver.

By the way, the latest document we are referring to is [1].

[1]: 
https://developer.arm.com/documentation/102342//Programmers-model/Register-descriptions/Host-Base-System-Control-register-summary

Cheers,
Abdellatif



Re: [PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-11 Thread Abdellatif El Khlifi
Hi Mathieu,

On Fri, Mar 08, 2024 at 09:44:26AM -0700, Mathieu Poirier wrote:
> On Thu, 7 Mar 2024 at 12:40, Abdellatif El Khlifi
>  wrote:
> >
> > Hi Mathieu,
> >
> > > > +   do {
> > > > +   state_reg = readl(priv->reset_cfg.state_reg);
> > > > +   *rst_ack = EXTSYS_RST_ST_RST_ACK(state_reg);
> > > > +
> > > > +   if (*rst_ack == EXTSYS_RST_ACK_RESERVED) {
> > > > +   dev_err(dev, "unexpected RST_ACK value: 0x%x\n",
> > > > +   *rst_ack);
> > > > +   return -EINVAL;
> > > > +   }
> > > > +
> > > > +   /* expected ACK value read */
> > > > +   if ((*rst_ack & exp_ack) || (*rst_ack == exp_ack))
> > >
> > > I'm not sure why the second condition in this if() statement is needed.
> > > As far as I can tell the first condition will trigger and the second one
> > > won't be reached.
> >
> > The second condition takes care of the following: exp_ack and *rst_ack are
> > both 0.
> > This case happens when RST_REQ bit is cleared (meaning: No reset requested)
> > and we expect the RST_ACK to be 00 afterwards.
> >
> 
> This is the kind of conditions that definitely deserve documentation.
> Please split the conditions in two different if() statements and add a
> comment to explain what is going on.

Thanks, I'll address that.
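
A sketch of what the requested split could look like, pulled out of the polling loop into standalone helpers so it can be checked in isolation. The helper names are hypothetical, and the assumption that RST_ACK is a 2-bit field is inferred from the "00" wording above, not from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The single combined test as posted in the patch. */
bool ack_ok_combined(uint32_t rst_ack, uint32_t exp_ack)
{
    return (rst_ack & exp_ack) || (rst_ack == exp_ack);
}

/* The same logic split into two documented tests. */
bool ack_ok_split(uint32_t rst_ack, uint32_t exp_ack)
{
    /* At least one of the expected acknowledge bits is set. */
    if (rst_ack & exp_ack)
        return true;

    /*
     * RST_REQ was cleared (no reset requested), so the expected
     * acknowledge is 0. A bitwise AND with 0 can never be true,
     * hence the separate equality test for the all-zero case.
     */
    if (rst_ack == exp_ack)
        return true;

    return false;
}

/* Check that both forms agree for every 2-bit combination. */
void check_variants_agree(void)
{
    for (uint32_t ack = 0; ack < 4; ack++)
        for (uint32_t want = 0; want < 4; want++)
            assert(ack_ok_combined(ack, want) == ack_ok_split(ack, want));
}
```

The equality test only adds the case where both values are 0; any non-zero equal pair is already caught by the AND.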

> 
> > > > +/**
> > > > + * arm_rproc_load() - Load firmware to memory function for rproc_ops
> > > > + * @rproc: pointer to the remote processor object
> > > > + * @fw: pointer to the firmware
> > > > + *
> > > > + * Does nothing currently.
> > > > + *
> > > > + * Return:
> > > > + *
> > > > + * 0 for success.
> > > > + */
> > > > +static int arm_rproc_load(struct rproc *rproc, const struct firmware *fw)
> > > > +{
> > >
> > > What is the point of doing rproc_of_parse_firmware() if the firmware
> > > image is not loaded to memory?  Does the remote processor have some kind
> > > of default ROM image to run if it doesn't find anything in memory?
> >
> > Yes, the remote processor has a default FW image already loaded by default.
> >
> 
> That too would have mandated a comment - otherwise people looking at
> the code are left wondering, as I did.
> 
> > rproc_boot() [1] and _request_firmware() [2] fail if there is no FW file in
> > the filesystem or a filename provided.
> >
> > Please correct me if I'm wrong.
> 
> You are correct, the remoteproc subsystem expects a firmware image to
> be provided _and_ loaded into memory.  Providing a dummy image just to
> get the remote processor booted is a hack, but simply because the
> subsystem isn't tailored to handle this use case.  So I am left
> wondering what the plans are for this driver, i.e is this a real
> scenario that needs to be addressed or just an initial patchset to get
> a foundation for the driver.
> 
> In the former case we need to start talking about refactoring the
> subsystem so that it properly handles remote processors that don't
> need a firmware image.  In the latter case I'd rather see a patchset
> where the firmware image is loaded into RAM.

This is an initial patchset that allows turning the remote processor on and
off.
The FW is already loaded before the Corstone-1000 SoC is powered on. This
is done through the FPGA board bootloader in the case of the FPGA target,
or by the Corstone-1000 FVP model (emulator).

The plan for the driver is as follows:

Step 1: provide a foundation driver capable of turning the core on/off

Step 2: provide mailbox support for comms

Step 3: provide FW reload capability

Steps 2 & 3 are waiting for a HW update so the Cortex-A35 (running Linux) can
share memory with the remote core.

I'm happy to provide more explanation in the commit log to reflect this status.

Is it OK to go with step 1 as a foundation, please?

Cheers
Abdellatif



Re: [PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-12 Thread Abdellatif El Khlifi
Hi Mathieu,

On Tue, Mar 12, 2024 at 10:29:52AM -0600, Mathieu Poirier wrote:
> > This is an initial patchset for allowing to turn on and off the remote
> > processor.
> > The FW is already loaded before the Corstone-1000 SoC is powered on and this
> > is done through the FPGA board bootloader in case of the FPGA target. Or by
> > the Corstone-1000 FVP model (emulator).
> >
> From the above I take it that booting with a preloaded firmware is a
> scenario that needs to be supported and not just a temporary stage.

The current status of the Corstone-1000 SoC requires that there is
a preloaded firmware for the external core. Preloading is done externally
either through the FPGA bootloader or the emulator (FVP) before powering
on the SoC.

Corstone-1000 will be upgraded so that the A core running Linux is able
to share memory with the remote core and also to access the remote
core's memory, so that Linux can copy the firmware to it. These HW changes are
still under development.

This is why this patchset relies on preloaded firmware. It is step 1
of adding remoteproc support for Corstone.

When the HW is ready, we will be able to avoid preloading the firmware
and the user can do the following:

1) Use a default firmware filename stated in the DT (firmware-name property).
That is the one the remoteproc subsystem will use initially to load the
firmware file and start the remote core.

2) Then, the user can choose to use another firmware file:

echo stop >/sys/class/remoteproc/remoteproc0/state
echo -n new_firmware.elf > /sys/class/remoteproc/remoteproc0/firmware
echo start >/sys/class/remoteproc/remoteproc0/state
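
The same three writes can also be done from a program. A minimal sketch, assuming the sysfs layout used in the commands above; `write_attr` and `switch_firmware` are hypothetical helper names, not part of any kernel or library API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a string to <dir>/<attr>. Returns 0 on success, -1 on error. */
int write_attr(const char *dir, const char *attr, const char *value)
{
    char path[512];
    size_t len;
    FILE *f;
    int n;

    assert(dir && attr && value);
    len = strlen(value);

    n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;                  /* path would not fit */

    f = fopen(path, "w");
    if (!f)
        return -1;
    if (fwrite(value, 1, len, f) != len) {
        fclose(f);
        return -1;
    }
    /* fclose() flushes the buffer, so a rejected write shows up here. */
    return fclose(f) == 0 ? 0 : -1;
}

/* Stop the remote core, select another firmware file, start it again. */
int switch_firmware(const char *rproc_dir, const char *fw_name)
{
    if (write_attr(rproc_dir, "state", "stop") != 0)
        return -1;
    if (write_attr(rproc_dir, "firmware", fw_name) != 0)
        return -1;
    return write_attr(rproc_dir, "state", "start");
}
```

On a target this would be called as `switch_firmware("/sys/class/remoteproc/remoteproc0", "new_firmware.elf")`.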

> > The plan for the driver is as follows:
> >
> > Step 1: provide a foundation driver capable of turning the core on/off
> >
> > Step 2: provide mailbox support for comms
> >
> > Step 3: provide FW reload capability
> >
> What happens when a user wants to boot the remote processor with the
> firmware provided on the file system rather than the one preloaded
> into memory?

We will support this scenario when the HW is upgraded and copying the firmware
to the remote core memory becomes possible.

> Furthermore, how do we account for scenarios where the
> remote processor goes from running a firmware image on the file system
> to the firmware image loaded by an external entity?  Is this a valid
> scenario?

No, this scenario won't apply when we get the HW upgrade. No need for an
external entity anymore. The firmware(s) will all be files in the Linux
filesystem.

> > Steps 2 & 3 are waiting for a HW update so the Cortex-A35 (running Linux)
> > can share memory with the remote core.
> >
> > I'm happy to provide more explanation in the commit log to reflect this
> > status.
> >
> > Is it OK that we go with step 1 as a foundation please ?
> >
> 
> First let's clarify all the scenarios that need to be supported.  From
> there I will advise on how to proceed and what modifications to the
> subsystem's core should be made, if need be.

Thanks, I hope the answers above provide the information needed.

Cheers
Abdellatif



Re: [PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-13 Thread Abdellatif El Khlifi
Hi Mathieu,

On Wed, Mar 13, 2024 at 10:25:32AM -0600, Mathieu Poirier wrote:
> On Tue, Mar 12, 2024 at 05:32:52PM +, Abdellatif El Khlifi wrote:
> > Hi Mathieu,
> > 
> > On Tue, Mar 12, 2024 at 10:29:52AM -0600, Mathieu Poirier wrote:
> > > > This is an initial patchset for allowing to turn on and off the remote
> > > > processor.
> > > > The FW is already loaded before the Corstone-1000 SoC is powered on and
> > > > this is done through the FPGA board bootloader in case of the FPGA
> > > > target. Or by the Corstone-1000 FVP model (emulator).
> > > >
> > > From the above I take it that booting with a preloaded firmware is a
> > > scenario that needs to be supported and not just a temporary stage.
> > 
> > The current status of the Corstone-1000 SoC requires that there is
> > a preloaded firmware for the external core. Preloading is done externally
> > either through the FPGA bootloader or the emulator (FVP) before powering
> > on the SoC.
> > 
> 
> Ok
> 
> > Corstone-1000 will be upgraded in a way that the A core running Linux is
> > able to share memory with the remote core and also being able to access the
> > remote core memory so Linux can copy the firmware to. This HW changes are
> > still under development.
> >
> > This is why this patchset is relying on a preloaded firmware. And it's
> > the step 1 of adding remoteproc support for Corstone.
> >
> 
> Ok, so there is a HW problem where A core and M core can't see each other's
> memory, preventing the A core from copying the firmware image to the proper
> location.
> 
> When the HW is fixed, will there be a need to support scenarios where the
> firmware image has been preloaded into memory?

No, this scenario won't apply when we get the HW upgrade. No need for an
external entity anymore. The firmware(s) will all be files in the Linux
filesystem.

Cheers
Abdellatif



Re: [PATCH 3/3] dt-bindings: remoteproc: Add Arm remoteproc

2024-03-14 Thread Abdellatif El Khlifi
Hi Robin,

> > +  firmware-name:
> > +description: |
> > +  Default name of the firmware to load to the remote processor.
> 
> So... is loading the firmware image achieved by somehow bitbanging it
> through the one reset register, maybe? I find it hard to believe this is a
> complete and functional binding.
> 
> Frankly at the moment I'd be inclined to say it isn't even a remoteproc
> binding (or driver) at all, it's a reset controller. Bindings are a contract
> for describing the hardware, not the current state of Linux driver support -
> if this thing still needs mailboxes, shared memory, a reset vector register,
> or whatever else to actually be useful, those should be in the binding from
> day 1 so that a) people can write and deploy correct DTs now, such that
> functionality becomes available on their systems as soon as driver support
> catches up, and b) the community has any hope of being able to review
> whether the binding is appropriately designed and specified for the purpose
> it intends to serve.

This is an initial patchset that allows turning the remote processor on and
off.
The FW is already loaded before the Corstone-1000 SoC is powered on. This
is done through the FPGA board bootloader in the case of the FPGA target,
or by the Corstone-1000 FVP model (emulator).

The plan for the driver is as follows:

Step 1: provide a foundation driver capable of turning the core on/off
Step 2: provide mailbox support for comms
Step 3: provide FW reload capability

Steps 2 & 3 are waiting for a HW update so the Cortex-A35 (running Linux) can
share memory with the remote core.

So, when memory sharing becomes available in the FPGA and FVP the
DT binding will be upgraded with:

- mboxes property specifying the RX/TX mailboxes (based on MHU v2)
- memory-region property describing the virtio vrings

Currently the mailbox controller does exist in the HW but is not
usable via virtio (no memory sharing available).

Do you recommend I add the mboxes property even though we currently can't do
the comms?

> For instance right now it seems somewhat tenuous to describe two consecutive
> 32-bit registers as separate "reg" entries, but *maybe* it's OK if that's
> all there ever is. However if it's actually going to end up needing several
> more additional MMIO and/or memory regions for other functionality, then
> describing each register and location individually is liable to get
> unmanageable really fast, and a higher-level functional grouping (e.g. these
> reset-related registers together as a single 8-byte region) would likely be
> a better design.

Currently the HW provides only two registers to control the remote processors:
the reset control register and the reset status register.

It makes sense to me to use a mapped region of 8 bytes for both registers rather
than individual registers (since they are consecutive).
I'll update that, thanks for the suggestion.

Cheers,
Abdellatif



Re: [PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-14 Thread Abdellatif El Khlifi
Hi Sudeep,

On Thu, Mar 14, 2024 at 02:59:20PM +, Sudeep Holla wrote:
> On Thu, Mar 14, 2024 at 08:52:59AM -0600, Mathieu Poirier wrote:
> > On Wed, Mar 13, 2024 at 05:17:56PM +, Abdellatif El Khlifi wrote:
> > > Hi Mathieu,
> > >
> > > On Wed, Mar 13, 2024 at 10:25:32AM -0600, Mathieu Poirier wrote:
> > > > On Tue, Mar 12, 2024 at 05:32:52PM +, Abdellatif El Khlifi wrote:
> > > > > Hi Mathieu,
> > > > >
> > > > > On Tue, Mar 12, 2024 at 10:29:52AM -0600, Mathieu Poirier wrote:
> > > > > > > This is an initial patchset for allowing to turn on and off the 
> > > > > > > remote processor.
> > > > > > > The FW is already loaded before the Corstone-1000 SoC is powered 
> > > > > > > on and this
> > > > > > > is done through the FPGA board bootloader in case of the FPGA 
> > > > > > > target. Or by the Corstone-1000 FVP model
> > > > > > > (emulator).
> > > > > > >
> > > > > > >From the above I take it that booting with a preloaded firmware is 
> > > > > > >a
> > > > > > scenario that needs to be supported and not just a temporary stage.
> > > > >
> > > > > The current status of the Corstone-1000 SoC requires that there is
> > > > > a preloaded firmware for the external core. Preloading is done 
> > > > > externally
> > > > > either through the FPGA bootloader or the emulator (FVP) before 
> > > > > powering
> > > > > on the SoC.
> > > > >
> > > >
> > > > Ok
> > > >
> > > > > Corstone-1000 will be upgraded in a way that the A core running Linux 
> > > > > is able
> > > > > to share memory with the remote core and also being able to access 
> > > > > the remote
> > > > > core memory so Linux can copy the firmware to. This HW changes are 
> > > > > still
> > > > > This is why this patchset is relying on a preloaded firmware. And 
> > > > > it's the step 1
> > > > > of adding remoteproc support for Corstone.
> > > > >
> > > >
> > > > Ok, so there is a HW problem where A core and M core can't see each 
> > > > other's
> > > > memory, preventing the A core from copying the firmware image to the 
> > > > proper
> > > > location.
> > > >
> > > > When the HW is fixed, will there be a need to support scenarios where 
> > > > the
> > > > firmware image has been preloaded into memory?
> > >
> > > No, this scenario won't apply when we get the HW upgrade. No need for an
> > > external entity anymore. The firmware(s) will all be files in the linux 
> > > filesystem.
> > >
> >
> > Very well.  I am willing to continue with this driver but it does so little 
> > that
> > I wonder if it wouldn't simply be better to move forward with upstreaming 
> > when
> > the HW is fixed.  The choice is yours.
> >
> 
> I think Robin has raised few points that need clarification. I think it was
> done as part of DT binding patch. I share those concerns and I wanted to
> reaching to the same concerns by starting the questions I asked on corstone
> device tree changes.

Please have a look at my answer to Robin with clarifications [1].

In addition to mapping the whole register area instead of describing the
individual registers in the reg property, I'll also add the mboxes property,
as Krzysztof confirmed.

[1]: https://lore.kernel.org/all/20240314134928.ga27...@e130802.arm.com/

Cheers,
Abdellatif



Re: [PATCH 3/3] dt-bindings: remoteproc: Add Arm remoteproc

2024-03-14 Thread Abdellatif El Khlifi
Hi Krzysztof,

On Thu, Mar 14, 2024 at 02:56:53PM +0100, Krzysztof Kozlowski wrote:
> On 14/03/2024 14:49, Abdellatif El Khlifi wrote:
> >> Frankly at the moment I'd be inclined to say it isn't even a remoteproc
> >> binding (or driver) at all, it's a reset controller. Bindings are a 
> >> contract
> >> for describing the hardware, not the current state of Linux driver support 
> >> -
> >> if this thing still needs mailboxes, shared memory, a reset vector 
> >> register,
> >> or whatever else to actually be useful, those should be in the binding from
> >> day 1 so that a) people can write and deploy correct DTs now, such that
> >> functionality becomes available on their systems as soon as driver support
> >> catches up, and b) the community has any hope of being able to review
> >> whether the binding is appropriately designed and specified for the purpose
> >> it intends to serve.
> > 
> > This is an initial patchset for allowing to turn on and off the remote 
> > processor.
> > The FW is already loaded before the Corstone-1000 SoC is powered on and this
> > is done through the FPGA board bootloader in case of the FPGA target.
> > Or by the Corstone-1000 FVP model (emulator).
> > 
> > The plan for the driver is as follows:
> > 
> > Step 1: provide a foundation driver capable of turning the core on/off
> > Step 2: provide mailbox support for comms
> > Step 3: provide FW reload capability
> > 
> > Steps 2 & 3 are waiting for a HW update so the Cortex-A35 (running Linux) 
> > can
> > share memory with the remote core.
> > 
> > So, when memory sharing becomes available in the FPGA and FVP the
> > DT binding will be upgraded with:
> > 
> > - mboxes property specifying the RX/TX mailboxes (based on MHU v2)
> > - memory-region property describing the virtio vrings
> > 
> > Currently the mailbox controller does exist in the HW but is not
> > usable via virtio (no memory sharing available).
> > 
> > Do you recommend I add the mboxes property even currently we can't do the 
> > comms ?
> 
> Bindings should be complete, regardless whether Linux driver supports it
> or not. Please see writing bindings document for explanation on this and
> other rules.
> 
> So yes: please describe as much as possible/reasonable.

I'll do that, thanks.

Cheers,
Abdellatif



Re: [PATCH 3/3] dt-bindings: remoteproc: Add Arm remoteproc

2024-03-15 Thread Abdellatif El Khlifi
Hi Sudeep,

On Thu, Mar 14, 2024 at 03:19:13PM +, Sudeep Holla wrote:
> > The plan for the driver is as follows:
> >
> > Step 1: provide a foundation driver capable of turning the core on/off
> > Step 2: provide mailbox support for comms
> > Step 3: provide FW reload capability
> >
> > Steps 2 & 3 are waiting for a HW update so the Cortex-A35 (running Linux) 
> > can
> > share memory with the remote core.
> >
> 
> Honestly, I would prefer to know the overall design before pushing any partial
> solution. If you know the final complete solution, present the same with
> the complete device tree binding for better understanding and review.

Sounds good to me. I'll make the binding as complete as possible.

> Agreed, but it is part of a bigger block with other functionality in place.
> MFD/syscon might be better way to use these registers. You never know in
> future you might want to use another set of 2-4 registers with a different
> functionality in another driver.
> 
> > It makes sense to me to use a mapped region of 8 bytes for both registers 
> > rather
> > than individual registers (since they are consecutive).
> 
> Not exactly. Are you sure, Linux will not have to use another other registers
> in that block ? Will you keep creating such (random if I may call it so)
> bindings for a smaller sets of register under "Host Base System Control
> registers".
> 
> I would see if it makes sense to put together a single binding for
> this "Host Base System Control" register(not sure what exactly that means).
> Use MFD/regmap you access parts of this block. The remoteproc driver can
> then be semi-generic(meaning applicable to group of similar platforms)
> based on the platform compatible and use this regmap to provide the
> functionality needed.

I like the idea of using syscon/regmap to represent the "Host Base System
Control registers" area. Thank you for suggesting that.

I think syscon is the way to go (rather than MFD). With syscon we can use
the generic syscon driver, which exposes a set of MMIO registers as a regmap
so that multiple device drivers can access them.
In our case these MMIO registers will be the "Host Base System Control
registers".

The remoteproc node will be a child of the sysctrl node.

Cheers,
Abdellatif



Re: [PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-25 Thread Abdellatif El Khlifi
Hi Mathieu,

> > > > > > > > This is an initial patchset for allowing to turn on and off the 
> > > > > > > > remote processor.
> > > > > > > > The FW is already loaded before the Corstone-1000 SoC is 
> > > > > > > > powered on and this
> > > > > > > > is done through the FPGA board bootloader in case of the FPGA 
> > > > > > > > target. Or by the Corstone-1000 FVP model
> > > > > > > > (emulator).
> > > > > > > >
> > > > > > > >From the above I take it that booting with a preloaded firmware 
> > > > > > > >is a
> > > > > > > scenario that needs to be supported and not just a temporary 
> > > > > > > stage.
> > > > > >
> > > > > > The current status of the Corstone-1000 SoC requires that there is
> > > > > > a preloaded firmware for the external core. Preloading is done 
> > > > > > externally
> > > > > > either through the FPGA bootloader or the emulator (FVP) before 
> > > > > > powering
> > > > > > on the SoC.
> > > > > >
> > > > >
> > > > > Ok
> > > > >
> > > > > > Corstone-1000 will be upgraded in a way that the A core running 
> > > > > > Linux is able
> > > > > > to share memory with the remote core and also being able to access 
> > > > > > the remote
> > > > > > core memory so Linux can copy the firmware to. This HW changes are 
> > > > > > still
> > > > > > This is why this patchset is relying on a preloaded firmware. And 
> > > > > > it's the step 1
> > > > > > of adding remoteproc support for Corstone.
> > > > > >
> > > > >
> > > > > Ok, so there is a HW problem where A core and M core can't see each 
> > > > > other's
> > > > > memory, preventing the A core from copying the firmware image to the 
> > > > > proper
> > > > > location.
> > > > >
> > > > > When the HW is fixed, will there be a need to support scenarios where 
> > > > > the
> > > > > firmware image has been preloaded into memory?
> > > >
> > > > No, this scenario won't apply when we get the HW upgrade. No need for an
> > > > external entity anymore. The firmware(s) will all be files in the linux 
> > > > filesystem.
> > > >
> > >
> > > Very well.  I am willing to continue with this driver but it does so 
> > > little that
> > > I wonder if it wouldn't simply be better to move forward with upstreaming 
> > > when
> > > the HW is fixed.  The choice is yours.
> > >
> >
> > I think Robin has raised few points that need clarification. I think it was
> > done as part of DT binding patch. I share those concerns and I wanted to
> > reaching to the same concerns by starting the questions I asked on corstone
> > device tree changes.
> >
> 
> I also agree with Robin's point of view.  Proceeding with an initial
> driver with minimal functionality doesn't preclude having complete
> bindings.  But that said and as I pointed out, it might be better to
> wait for the HW to be fixed before moving forward.

We checked with the HW teams. The missing features will be implemented, but
this will take time.

The foundation driver as it stands is still valuable for people who want to
know how to control the power of Corstone external systems in a future-proof
manner (even in its incomplete state). We would prefer to address all the
review comments so that it can be merged. This includes making the DT binding
as complete as possible, as you advised. Then, once the HW is ready, I'll
implement the comms and the FW reload part. Is that OK?

Cheers,
Abdellatif



Re: [PATCH 1/3] remoteproc: Add Arm remoteproc driver

2024-03-26 Thread Abdellatif El Khlifi
Hi Mathieu,

> > > > > > > > > > This is an initial patchset for allowing to turn on and off 
> > > > > > > > > > the remote processor.
> > > > > > > > > > The FW is already loaded before the Corstone-1000 SoC is 
> > > > > > > > > > powered on and this
> > > > > > > > > > is done through the FPGA board bootloader in case of the 
> > > > > > > > > > FPGA target. Or by the Corstone-1000 FVP model
> > > > > > > > > > (emulator).
> > > > > > > > > >
> > > > > > > > > >From the above I take it that booting with a preloaded 
> > > > > > > > > >firmware is a
> > > > > > > > > scenario that needs to be supported and not just a temporary 
> > > > > > > > > stage.
> > > > > > > >
> > > > > > > > The current status of the Corstone-1000 SoC requires that there 
> > > > > > > > is
> > > > > > > > a preloaded firmware for the external core. Preloading is done 
> > > > > > > > externally
> > > > > > > > either through the FPGA bootloader or the emulator (FVP) before 
> > > > > > > > powering
> > > > > > > > on the SoC.
> > > > > > > >
> > > > > > >
> > > > > > > Ok
> > > > > > >
> > > > > > > > Corstone-1000 will be upgraded in a way that the A core running 
> > > > > > > > Linux is able
> > > > > > > > to share memory with the remote core and also being able to 
> > > > > > > > access the remote
> > > > > > > > core memory so Linux can copy the firmware to. This HW changes 
> > > > > > > > are still
> > > > > > > > This is why this patchset is relying on a preloaded firmware. 
> > > > > > > > And it's the step 1
> > > > > > > > of adding remoteproc support for Corstone.
> > > > > > > >
> > > > > > >
> > > > > > > Ok, so there is a HW problem where A core and M core can't see 
> > > > > > > each other's
> > > > > > > memory, preventing the A core from copying the firmware image to 
> > > > > > > the proper
> > > > > > > location.
> > > > > > >
> > > > > > > When the HW is fixed, will there be a need to support scenarios 
> > > > > > > where the
> > > > > > > firmware image has been preloaded into memory?
> > > > > >
> > > > > > No, this scenario won't apply when we get the HW upgrade. No need 
> > > > > > for an
> > > > > > external entity anymore. The firmware(s) will all be files in the 
> > > > > > linux filesystem.
> > > > > >
> > > > >
> > > > > Very well.  I am willing to continue with this driver but it does so 
> > > > > little that
> > > > > I wonder if it wouldn't simply be better to move forward with 
> > > > > upstreaming when
> > > > > the HW is fixed.  The choice is yours.
> > > > >
> > > >
> > > > I think Robin has raised few points that need clarification. I think it 
> > > > was
> > > > done as part of DT binding patch. I share those concerns and I wanted to
> > > > reaching to the same concerns by starting the questions I asked on 
> > > > corstone
> > > > device tree changes.
> > > >
> > >
> > > I also agree with Robin's point of view.  Proceeding with an initial
> > > driver with minimal functionality doesn't preclude having complete
> > > bindings.  But that said and as I pointed out, it might be better to
> > > wait for the HW to be fixed before moving forward.
> >
> > We checked with the HW teams. The missing features will be implemented but
> > this will take time.
> >
> > The foundation driver as it is right now is still valuable for people 
> > wanting to
> > know how to power control Corstone external systems in a future proof manner
> > (even in the incomplete state). We prefer to address all the review comments
> > made so it can be merged. This includes making the DT binding as complete as
> > possible as you advised. Then, once the HW is ready, I'll implement the 
> > comms
> > and the FW reload part. Is that OK please ?
> >
> 
> I'm in agreement with that plan as long as we agree the current
> preloaded heuristic is temporary and is not a valid long term
> scenario.

Yes, that's the plan, no problem.

Cheers,
Abdellatif



[PATCH v2 0/5] remoteproc: arm64: Introduce remoteproc support for Corstone-1000 External Systems

2024-08-22 Thread Abdellatif El Khlifi
The Corstone-1000 IoT Reference Design Platform [A] supports up to two External
Systems processors.

This patchset allows these processors to be controlled through the remoteproc
subsystem.

The Corstone-1000 implements the SSE-710 subsystem [B] which defines the
MMIO-mapped reset registers (EXT_SYS*) used for controlling the
External Systems [C].

This patchset provides the following:

- Device tree bindings for the SSE-710 subsystem with syscon support
- Device tree bindings for the SSE-710 External System
- Corstone-1000 External System, syscon and MHUs device tree nodes
- Arm MHUv2 Mailbox [F] device tree nodes for Corstone-1000
- Corstone-1000 remoteproc driver with regmap support

For more details, please see the SSE-710 External System Remote
Processor bindings [D] and the SSE-710 Host Base System Control bindings [E].

[A]: https://developer.arm.com/Processors/Corstone-1000
[B]: https://developer.arm.com/documentation/102342/latest/
[C]: https://developer.arm.com/documentation/102342//Programmers-model/Register-descriptions/Host-Base-System-Control-register-summary
[D]: Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
[E]: Documentation/devicetree/bindings/arm/arm,sse710-host-base-sysctrl.yaml
[F]: Documentation/devicetree/bindings/mailbox/arm,mhuv2.yaml

Changelog:


v2:

* provide SSE-710 syscon bindings
* provide SSE-710 External System bindings
* add Corstone-1000 External System node under syscon
* add Arm MHUv2 Mailbox device tree nodes for Corstone-1000
* add regmap support for the driver
* use devm_rproc_* APIs
* refactoring

v1: [1]

* introduce the Corstone-1000 remoteproc support

List of previous patches:

[1]: https://lore.kernel.org/all/20240301164227.339208-1-abdellatif.elkhl...@arm.com/

Cheers,
Abdellatif

Abdellatif El Khlifi (5):
  dt-bindings: remoteproc: sse710: Add the External Systems remote
processors
  dt-bindings: arm: sse710: Add Host Base System Control
  arm64: dts: corstone1000: Add MHU nodes used by the External System
  arm64: dts: corstone1000: Add External System support
  remoteproc: arm64: corstone1000: Add the External Systems driver

 .../arm/arm,sse710-host-base-sysctrl.yaml |  56 +++
 .../remoteproc/arm,sse710-extsys.yaml |  90 +
 arch/arm64/boot/dts/arm/corstone1000.dtsi |  34 +-
 drivers/remoteproc/Kconfig|  14 +
 drivers/remoteproc/Makefile   |   1 +
 drivers/remoteproc/corstone1000_rproc.c   | 350 ++
 6 files changed, 544 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/devicetree/bindings/arm/arm,sse710-host-base-sysctrl.yaml
 create mode 100644 Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
 create mode 100644 drivers/remoteproc/corstone1000_rproc.c


base-commit: 8fa052c29e509f3e47d56d7fc2ca28094d78c60a
-- 
2.25.1




[PATCH v2 1/5] dt-bindings: remoteproc: sse710: Add the External Systems remote processors

2024-08-22 Thread Abdellatif El Khlifi
Add devicetree binding schema for the External Systems remote processors

The External Systems remote processors are provided on the Corstone-1000
IoT Reference Design Platform via the SSE-710 subsystem.

For more details about the External Systems, please see Corstone SSE-710
subsystem features [1].

[1]: https://developer.arm.com/documentation/102360//Overview-of-Corstone-1000/Corstone-SSE-710-subsystem-features

Signed-off-by: Abdellatif El Khlifi 
---
 .../remoteproc/arm,sse710-extsys.yaml | 90 +++
 1 file changed, 90 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml

diff --git a/Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml b/Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
new file mode 100644
index ..827ba8d962f1
--- /dev/null
+++ b/Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
@@ -0,0 +1,90 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/remoteproc/arm,sse710-extsys.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: SSE-710 External System Remote Processor
+
+maintainers:
+  - Abdellatif El Khlifi 
+  - Hugues Kamba Mpiana 
+
+description: |
+  SSE-710 is a heterogeneous subsystem supporting up to two remote
+  processors, aka the External Systems.
+
+properties:
+  compatible:
+    enum:
+      - arm,sse710-extsys
+
+  firmware-name:
+    description:
+      The default name of the firmware to load to the remote processor.
+
+  '#extsys-id':
+    description:
+      The External System ID.
+    enum: [0, 1]
+
+  mbox-names:
+    items:
+      - const: txes0
+      - const: rxes0
+
+  mboxes:
+    description:
+      The list of Message Handling Unit (MHU) channels used for bidirectional
+      communication. This property is only required if the virtio-based Rpmsg
+      messaging bus is used. For more details see the Arm MHUv2 Mailbox
+      Controller at devicetree/bindings/mailbox/arm,mhuv2.yaml
+
+    minItems: 2
+    maxItems: 2
+
+  memory-region:
+    description:
+      If present, a phandle for a reserved memory area that is used for vdev
+      buffer, resource table, vring region and others used by the remote
+      processor.
+    minItems: 2
+    maxItems: 32
+
+required:
+  - compatible
+  - firmware-name
+  - '#extsys-id'
+
+additionalProperties: false
+
+examples:
+  - |
+    reserved-memory {
+        #address-cells = <2>;
+        #size-cells = <2>;
+
+        extsys0_vring0: vdev0vring0@82001000 {
+            reg = <0 0x82001000 0 0x8000>;
+            no-map;
+        };
+
+        extsys0_vring1: vdev0vring1@82009000 {
+            reg = <0 0x82009000 0 0x8000>;
+            no-map;
+        };
+    };
+
+    syscon@1a01 {
+        compatible = "arm,sse710-host-base-sysctrl", "simple-mfd", "syscon";
+        reg = <0x1a01 0x1000>;
+
+        extsys0 {
+            compatible = "arm,sse710-extsys";
+            #extsys-id = <0>;
+            firmware-name = "es_flashfw.elf";
+            mbox-names = "txes0", "rxes0";
+            mboxes = <&mhu0_hes0 0 1>, <&mhu0_es0h 0 1>;
+            memory-region = <&extsys0_vring0>, <&extsys0_vring1>;
+        };
+    };
-- 
2.25.1




[PATCH v2 3/5] arm64: dts: corstone1000: Add MHU nodes used by the External System

2024-08-22 Thread Abdellatif El Khlifi
Add normal world mhu0_hes0 and mhu0_es0h nodes

In Corstone-1000 IoT Reference Design Platform, communication between the
host (Cortex-A35) running in normal world (EL0 and EL1) and the external
system (Cortex-M3) is done with MHU0.

MHU0 is a bidirectional communication channel described in the device tree
through mhu0_hes0 and mhu0_es0h.

Signed-off-by: Abdellatif El Khlifi 
---
 arch/arm64/boot/dts/arm/corstone1000.dtsi | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/arm/corstone1000.dtsi b/arch/arm64/boot/dts/arm/corstone1000.dtsi
index bb9b96fb5314..01c65195ca53 100644
--- a/arch/arm64/boot/dts/arm/corstone1000.dtsi
+++ b/arch/arm64/boot/dts/arm/corstone1000.dtsi
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0 OR MIT
 /*
- * Copyright (c) 2022, Arm Limited. All rights reserved.
+ * Copyright (c) 2022, 2024 Arm Limited. All rights reserved.
  * Copyright (c) 2022, Linaro Limited. All rights reserved.
  *
  */
@@ -134,6 +134,26 @@ uart1: serial@1a52 {
clock-names = "uartclk", "apb_pclk";
};
 
+   mhu0_hes0: mhu@1b00 {
+   compatible = "arm,mhuv2-tx","arm,primecell";
+   reg = <0x1b00 0x1000>;
+   clocks = <&refclk100mhz>;
+   clock-names = "apb_pclk";
+   interrupts = ;
+   #mbox-cells = <2>;
+   arm,mhuv2-protocols = <0 1>;
+   };
+
+   mhu0_es0h: mhu@1b01 {
+   compatible = "arm,mhuv2-rx","arm,primecell";
+   reg = <0x1b01 0x1000>;
+   clocks = <&refclk100mhz>;
+   clock-names = "apb_pclk";
+   interrupts = ;
+   #mbox-cells = <2>;
+   arm,mhuv2-protocols = <0 1>;
+   };
+
mhu_hse1: mailbox@1b82 {
compatible = "arm,mhuv2-tx", "arm,primecell";
reg = <0x1b82 0x1000>;
-- 
2.25.1




[PATCH v2 4/5] arm64: dts: corstone1000: Add External System support

2024-08-22 Thread Abdellatif El Khlifi
Add extsys0 remoteproc node as a child node of syscon

extsys0 describes the Corstone-1000 external system [1]
(the remote processor).

The host (Cortex-A35) can control the external system through memory mapped
registers located in a memory area called the
Host Base System Control [2][3]. This area is part of the host memory
space.

We use syscon to represent the Host Base System Control area and the
remoteproc node is a child node.

[1]: Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
[2]: https://developer.arm.com/documentation/102342//Programmers-model/Register-descriptions/Host-Base-System-Control-register-summary
[3]: Documentation/devicetree/bindings/arm/arm,sse710-host-base-sysctrl.yaml

Signed-off-by: Abdellatif El Khlifi 
---
 arch/arm64/boot/dts/arm/corstone1000.dtsi | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/arm64/boot/dts/arm/corstone1000.dtsi b/arch/arm64/boot/dts/arm/corstone1000.dtsi
index 01c65195ca53..17d6638f9ca6 100644
--- a/arch/arm64/boot/dts/arm/corstone1000.dtsi
+++ b/arch/arm64/boot/dts/arm/corstone1000.dtsi
@@ -103,6 +103,18 @@ soc {
interrupt-parent = <&gic>;
ranges;
 
+   syscon@1a01 {
+   compatible = "arm,sse710-host-base-sysctrl",
+"simple-mfd", "syscon";
+   reg = <0x1a01 0x1000>;
+
+   extsys0 {
+   compatible = "arm,sse710-extsys";
+   #extsys-id = <0>;
+   firmware-name = "es_flashfw.elf";
+   };
+   };
+
timer@1a22 {
compatible = "arm,armv7-timer-mem";
reg = <0x1a22 0x1000>;
-- 
2.25.1




[PATCH v2 5/5] remoteproc: arm64: corstone1000: Add the External Systems driver

2024-08-22 Thread Abdellatif El Khlifi
Introduce remoteproc support for Corstone-1000 external systems

The Corstone-1000 IoT Reference Design Platform supports up to two
external systems processors. These processors can be switched on or off
using their reset registers.

For more details, please see the SSE-710 External System Remote
Processor binding [1] and the SSE-710 Host Base System Control binding [2].

The reset registers are MMIO mapped registers accessed using regmap.

[1]: Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
[2]: Documentation/devicetree/bindings/arm/arm,sse710-host-base-sysctrl.yaml

Signed-off-by: Abdellatif El Khlifi 
---
 drivers/remoteproc/Kconfig  |  14 +
 drivers/remoteproc/Makefile |   1 +
 drivers/remoteproc/corstone1000_rproc.c | 350 
 3 files changed, 365 insertions(+)
 create mode 100644 drivers/remoteproc/corstone1000_rproc.c

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 0f0862e20a93..a0ff5d4f2319 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -379,6 +379,20 @@ config XLNX_R5_REMOTEPROC
 
  It's safe to say N if not interested in using RPU r5f cores.
 
+config CORSTONE1000_REMOTEPROC
+	tristate "Arm Corstone-1000 remoteproc support"
+	depends on ARM64 || (HAS_IOMEM && COMPILE_TEST)
+	help
+	  Say y here to support Arm Corstone-1000 remote processors via the
+	  remote processor framework.
+
+	  Corstone-1000 remote processors are controlled with reset status
+	  and control registers. The driver also supports control of multiple
+	  remote cores at the same time.
+
+	  It's safe to say N here if not interested in utilizing remote
+	  processors.
+
 endif # REMOTEPROC
 
 endmenu
diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
index 5ff4e2fee4ab..e017f75143e3 100644
--- a/drivers/remoteproc/Makefile
+++ b/drivers/remoteproc/Makefile
@@ -40,3 +40,4 @@ obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)	+= ti_k3_dsp_remoteproc.o
 obj-$(CONFIG_TI_K3_M4_REMOTEPROC)  += ti_k3_m4_remoteproc.o
 obj-$(CONFIG_TI_K3_R5_REMOTEPROC)  += ti_k3_r5_remoteproc.o
 obj-$(CONFIG_XLNX_R5_REMOTEPROC)   += xlnx_r5_remoteproc.o
+obj-$(CONFIG_CORSTONE1000_REMOTEPROC)  += corstone1000_rproc.o
diff --git a/drivers/remoteproc/corstone1000_rproc.c b/drivers/remoteproc/corstone1000_rproc.c
new file mode 100644
index ..bf351af6a1c3
--- /dev/null
+++ b/drivers/remoteproc/corstone1000_rproc.c
@@ -0,0 +1,350 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Arm Corstone-1000 Remoteproc driver
+ *
+ * The driver adds remoteproc support for the external cores used in Arm
+ * Corstone-1000 IoT Reference Design Platform [1][2]
+ * [1] Arm Corstone-1000 Technical Overview: https://developer.arm.com/documentation/102360/
+ * [2] Arm Corstone SSE-710 Subsystem Technical Reference Manual: https://developer.arm.com/documentation/102342/
+ *
+ * Copyright (C) 2024 ARM Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "remoteproc_internal.h"
+
+/**
+ * struct corstone1000_rproc - Arm remote processor instance
+ * @rproc: rproc handler
+ * @regmap: MMIO register map
+ * @ctrl_reg: control register offset
+ * @state_reg: state register offset
+ */
+struct corstone1000_rproc {
+   struct rproc *rproc;
+   struct regmap *regmap;
+   u16 ctrl_reg;
+   u16 state_reg;
+};
+
+/* Definitions for Arm Corstone-1000 External System */
+
+/* External Systems identifiers */
+#define EXT_SYS0_ID                     (0) /* External System 0 ID */
+#define EXT_SYS1_ID                     (1) /* External System 1 ID */
+
+/* External System 0 registers offset */
+#define EXT_SYS0_RST_CTRL               (0x310) /* Reset Control register */
+#define EXT_SYS0_RST_ST                 (0x314) /* Reset Status register */
+
+/* External System 1 registers offset */
+#define EXT_SYS1_RST_CTRL               (0x318) /* Reset Control register */
+#define EXT_SYS1_RST_ST                 (0x31c) /* Reset Status register */
+
+/* External System Reset Control register bit definitions */
+#define EXTSYS_RST_CTRL_CPUWAIT         BIT(0) /* CPU Wait control */
+#define EXTSYS_RST_CTRL_RST_REQ         BIT(1) /* Reset request */
+
+/* Status of reset request bits */
+#define EXTSYS_RST_ACK_MASK             GENMASK(2, 1)
+#define GET_EXTSYS_RST_ST_RST_ACK(x)    ((u8)(FIELD_GET(EXTSYS_RST_ACK_MASK, \
+                                                        (x))))
+
+/* Possible values for the Status of reset request */
+#define EXTSYS_RST_ACK_NO_RESET_REQ     (0x0)
+#define EXTSYS_RST_ACK_NOT_COMPLETE     (0x1)
+#define EXTSYS_RST_ACK_COMPLETE         (0x2)
+#define EXTSYS_RST_ACK_RESERVED         (0x3)
+
+/* Polling settings used when reading th

[PATCH v2 2/5] dt-bindings: arm: sse710: Add Host Base System Control

2024-08-22 Thread Abdellatif El Khlifi
Add devicetree binding schema for the SSE-710 Host Base System Control

SSE-710 is implemented by the Corstone-1000 IoT Reference Design
Platform [1].

The Host Base System Control has registers to control the clocks, power,
and reset for the SSE-710 subsystem [2]. It resides within the AONTOP power
domain. The registers are mapped under the SSE-710 Host System memory map [3].

[1]: https://developer.arm.com/Processors/Corstone-1000
[2]: https://developer.arm.com/documentation/102342/latest/
[3]: https://developer.arm.com/documentation/102342//Programmers-model/Register-descriptions/Host-Base-System-Control-register-summary

Signed-off-by: Abdellatif El Khlifi 
---
 .../arm/arm,sse710-host-base-sysctrl.yaml | 56 +++
 1 file changed, 56 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/arm,sse710-host-base-sysctrl.yaml

diff --git a/Documentation/devicetree/bindings/arm/arm,sse710-host-base-sysctrl.yaml b/Documentation/devicetree/bindings/arm/arm,sse710-host-base-sysctrl.yaml
new file mode 100644
index ..e344a73e329d
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/arm,sse710-host-base-sysctrl.yaml
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/arm/arm,sse710-host-base-sysctrl.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: SSE-710 Host Base System Control
+
+maintainers:
+  - Abdellatif El Khlifi 
+  - Hugues Kamba Mpiana 
+
+description: |+
+  The Host Base System Control has registers to control the clocks, power, and
+  reset for SSE-710 subsystem. It resides within AONTOP power domain.
+  The registers are mapped under the SSE-710 Host System memory map.
+
+properties:
+  compatible:
+    items:
+      - enum:
+          - arm,sse710-host-base-sysctrl
+      - const: simple-mfd
+      - const: syscon
+
+  reg:
+    maxItems: 1
+
+patternProperties:
+  "^extsys[0-1]$":
+    description:
+      SSE-710 subsystem supports up to two External Systems.
+    $ref: /schemas/remoteproc/arm,sse710-extsys.yaml#
+    unevaluatedProperties: false
+
+additionalProperties: false
+
+required:
+  - compatible
+  - reg
+
+examples:
+  - |
+    syscon@1a01 {
+        compatible = "arm,sse710-host-base-sysctrl", "simple-mfd", "syscon";
+        reg = <0x1a01 0x1000>;
+
+        extsys0 {
+            compatible = "arm,sse710-extsys";
+            firmware-name = "es_flashfw.elf";
+            #extsys-id = <0>;
+            mbox-names = "txes0", "rxes0";
+            mboxes = <&mhu0_hes0 0 1>, <&mhu0_es0h 0 1>;
+            memory-region = <&extsys0_vring0>, <&extsys0_vring1>;
+        };
+    };
-- 
2.25.1




[PATCH] Staging: rtl8192e: fix line length coding style issue in rtllib_softmac.c

2016-03-18 Thread Yousof El-Sayed
This is a patch to the rtllib_softmac.c file that fixes up all instances of the
'line over 80 characters' warnings found by the checkpatch.pl tool.

Signed-off-by: Yousof El-Sayed 
---
 drivers/staging/rtl8192e/rtllib_softmac.c | 35 +++
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtllib_softmac.c b/drivers/staging/rtl8192e/rtllib_softmac.c
index cfab715..9ba2230 100644
--- a/drivers/staging/rtl8192e/rtllib_softmac.c
+++ b/drivers/staging/rtl8192e/rtllib_softmac.c
@@ -389,7 +389,8 @@ static void rtllib_send_beacon(struct rtllib_device *ieee)
 
if (ieee->beacon_txing && ieee->ieee_up)
mod_timer(&ieee->beacon_timer, jiffies +
- (msecs_to_jiffies(ieee->current_network.beacon_interval - 5)));
+ (msecs_to_jiffies
+  (ieee->current_network.beacon_interval - 5)));
 }
 
 
@@ -601,7 +602,8 @@ static void rtllib_softmac_scan_wq(void *data)
(ieee->current_network.channel + 1) %
MAX_CHANNEL_NUMBER;
if (ieee->scan_watch_dog++ > MAX_CHANNEL_NUMBER) {
-   if (!ieee->active_channel_map[ieee->current_network.channel])
+   if (!ieee->active_channel_map[ieee->
+   current_network.channel])
ieee->current_network.channel = 6;
goto out; /* no good chans */
}
@@ -1716,8 +1718,9 @@ inline void rtllib_softmac_new_net(struct rtllib_device *ieee,
if (ieee->iw_mode == IW_MODE_INFRA) {
/* Join the network for the first time */
ieee->AsocRetryCount = 0;
-   if ((ieee->current_network.qos_data.supported == 1) &&
-   ieee->current_network.bssht.bdSupportHT)
+   if ((ieee->current_network.qos_data.supported 
+ == 1) && 
+  ieee->current_network.bssht.bdSupportHT)
HTResetSelfAndSavePeerSetting(ieee,
 &(ieee->current_network));
else
@@ -2044,8 +2047,8 @@ static short rtllib_sta_ps_sleep(struct rtllib_device *ieee, u64 *time)
}
 
*time = ieee->current_network.last_dtim_sta_time
-   + msecs_to_jiffies(ieee->current_network.beacon_interval *
-   LPSAwakeIntvl_tmp);
+   + msecs_to_jiffies(ieee->
+   current_network.beacon_interval * LPSAwakeIntvl_tmp);
}
}
 
@@ -2237,11 +2240,15 @@ inline int rtllib_rx_assoc_resp(struct rtllib_device *ieee, struct sk_buff *skb,
ieee->assoc_id = aid;
ieee->softmac_stats.rx_ass_ok++;
/* station support qos */
-   /* Let the register setting default with Legacy station */
-   assoc_resp = (struct rtllib_assoc_response_frame *)skb->data;
+   /* Let the register setting default */
+   /*  with Legacy station */
+   assoc_resp = (struct 
+   rtllib_assoc_response_frame *)skb->data;
if (ieee->current_network.qos_data.supported == 1) {
-   if (rtllib_parse_info_param(ieee, assoc_resp->info_element,
-   rx_stats->len - sizeof(*assoc_resp),
+   if (rtllib_parse_info_param
+   (ieee, assoc_resp->info_element,
+   rx_stats->len - sizeof
+   (*assoc_resp),
network, rx_stats)) {
kfree(network);
return 1;
@@ -2254,8 +2261,9 @@ inline int rtllib_rx_assoc_resp(struct rtllib_device *ieee, struct sk_buff *skb,
   network->bssht.bdHTInfoLen);
if (ieee->handle_assoc_response != NULL)
ieee->handle_assoc_response(ieee->dev,
-(struct rtllib_assoc_response_frame *)header,
-network);
+(struct 
+ rtlli

[PATCH] staging: rtl8192e: fixed coding style issues

2016-03-19 Thread Yousof El-Sayed
Signed-off-by: Yousof El-Sayed 
---
 drivers/staging/rtl8192e/dot11d.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/rtl8192e/dot11d.c b/drivers/staging/rtl8192e/dot11d.c
index 4d8fb41..a08bfef 100644
--- a/drivers/staging/rtl8192e/dot11d.c
+++ b/drivers/staging/rtl8192e/dot11d.c
@@ -50,10 +50,9 @@ void dot11d_init(struct rtllib_device *ieee)
 
pDot11dInfo->State = DOT11D_STATE_NONE;
pDot11dInfo->CountryIeLen = 0;
-   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER+1);
-   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER+1);
+   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER + 1);
+   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER + 1);
RESET_CIE_WATCHDOG(ieee);
-
 }
 EXPORT_SYMBOL(dot11d_init);
 
@@ -99,14 +98,13 @@ void Dot11d_Channelmap(u8 channel_plan, struct rtllib_device *ieee)
 }
 EXPORT_SYMBOL(Dot11d_Channelmap);
 
-
 void Dot11d_Reset(struct rtllib_device *ieee)
 {
struct rt_dot11d_info *pDot11dInfo = GET_DOT11D_INFO(ieee);
u32 i;
 
-   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER+1);
-   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER+1);
+   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER + 1);
+   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER + 1);
for (i = 1; i <= 11; i++)
(pDot11dInfo->channel_map)[i] = 1;
for (i = 12; i <= 14; i++)
@@ -123,8 +121,8 @@ void Dot11d_UpdateCountryIe(struct rtllib_device *dev, u8 *pTaddr,
u8 i, j, NumTriples, MaxChnlNum;
struct chnl_txpow_triple *pTriple;
 
-   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER+1);
-   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER+1);
+   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER + 1);
+   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER + 1);
MaxChnlNum = 0;
NumTriples = (CoutryIeLen - 3) / 3;
pTriple = (struct chnl_txpow_triple *)(pCoutryIe + 3);
-- 
1.8.3.1



Kconfig error: Missing dependency for MEMSTICK_UNSAFE_RESUME

2016-06-14 Thread Sascha El-Sharkawy
Dear Kernel developers,

we detected a missing dependency in the Kconfig model that allows Memstick
support for power management (MEMSTICK_UNSAFE_RESUME) to be configured even
if Power Management (PM) is disabled. We suggest adding a "depends on"
constraint to the Kconfig model for semantic correctness and to simplify the
configuration process by avoiding unnecessary configuration options (the
patch is attached to this mail).
Without this dependency, a user configuring the kernel cannot detect the
divergence between the desired and the actual behavior of the kernel.
We also filed a Bugzilla report with more details:
https://bugzilla.kernel.org/show_bug.cgi?id=116871

Sincerely yours,
Sascha El-Sharkawy

-- 
Sascha El-Sharkawy, MSc
University of Hildesheim          Tel.: +49 (0) 5121 / 883-40336
Institute of Computer Science     Fax:  +49 (0) 5121 / 883-40337
Universitaetsplatz 1              els...@sse.uni-hildesheim.de
D-31141 Hildesheim, Germany       http://www.sse.uni-hildesheim.de



MEMSTICK_UNSAFE_RESUME.patch
Description: MEMSTICK_UNSAFE_RESUME.patch


Re: Dear Friend, (BUSINESS PROPOSAL)

2016-03-14 Thread Adnan El-Sheriff
Hello,
My Names are Adnan El-Sheriff, A businessman from Jordan, real estate developer 
and contractor with projects.
It is my pleasure to contact you for a business joint venture partnership which 
I intend to establish in any country where business climate is good and 
lucrative. Though I have not met with you before but I believe, one has to risk 
confiding in someone to succeed sometimes in life. I need to invest hugely 
round the world now that the money is available and that I have the opportunity,
I will like to invest into these three key areas in any part of your country. 
If there is any other business that is better than what I am listing below, 
please kindly guide me for I will be very glad to follow your advice.
1). Real estate
2). the transport industry
3). General Trading.

If you are ready to do a joint business partnership with me and you can speak 
English language, then you very much have what it takes to deliver, then you 
are my kind of person and therefore I will like to hear from you.
I await your response soonest

Warmest regards,

Adnan El-Sheriff


Re: [PATCH] staging: rtl8192e: fixed coding style issues

2016-03-19 Thread Yousof El-Sayed
Hi,

Thank you for the email. Apologies for that; I'll get it sorted out now.

Thanks again

On Thu, Mar 17, 2016 at 10:11:18AM -0700, Greg KH wrote:
> On Thu, Mar 17, 2016 at 04:55:37PM +, Yousof El-Sayed wrote:
> > Signed-off-by: Yousof El-Sayed 
> 
> I can't take patches without any changelog entry, sorry.
> 
> And be specific about what and why you are changing anything, "coding
> style issues" is very vague.
> 
> thanks,
> 
> greg k-h


[PATCH] staging: rtl8192e: fixed coding style issues

2016-03-19 Thread Yousof El-Sayed
staging: rtl8192e - dot11d.c

[patch 1/2] Fixed throughout:
spaces preferred around that '+' (ctx:VxV)

[patch 2/2] Fixed throughout:
Please don't use multiple blank lines

Signed-off-by: Yousof El-Sayed 
---
 drivers/staging/rtl8192e/dot11d.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/rtl8192e/dot11d.c b/drivers/staging/rtl8192e/dot11d.c
index 4d8fb41..a08bfef 100644
--- a/drivers/staging/rtl8192e/dot11d.c
+++ b/drivers/staging/rtl8192e/dot11d.c
@@ -50,10 +50,9 @@ void dot11d_init(struct rtllib_device *ieee)
 
pDot11dInfo->State = DOT11D_STATE_NONE;
pDot11dInfo->CountryIeLen = 0;
-   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER+1);
-   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER+1);
+   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER + 1);
+   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER + 1);
RESET_CIE_WATCHDOG(ieee);
-
 }
 EXPORT_SYMBOL(dot11d_init);
 
@@ -99,14 +98,13 @@ void Dot11d_Channelmap(u8 channel_plan, struct rtllib_device *ieee)
 }
 EXPORT_SYMBOL(Dot11d_Channelmap);
 
-
 void Dot11d_Reset(struct rtllib_device *ieee)
 {
struct rt_dot11d_info *pDot11dInfo = GET_DOT11D_INFO(ieee);
u32 i;
 
-   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER+1);
-   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER+1);
+   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER + 1);
+   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER + 1);
for (i = 1; i <= 11; i++)
(pDot11dInfo->channel_map)[i] = 1;
for (i = 12; i <= 14; i++)
@@ -123,8 +121,8 @@ void Dot11d_UpdateCountryIe(struct rtllib_device *dev, u8 *pTaddr,
u8 i, j, NumTriples, MaxChnlNum;
struct chnl_txpow_triple *pTriple;
 
-   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER+1);
-   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER+1);
+   memset(pDot11dInfo->channel_map, 0, MAX_CHANNEL_NUMBER + 1);
+   memset(pDot11dInfo->MaxTxPwrDbmList, 0xFF, MAX_CHANNEL_NUMBER + 1);
MaxChnlNum = 0;
NumTriples = (CoutryIeLen - 3) / 3;
pTriple = (struct chnl_txpow_triple *)(pCoutryIe + 3);
-- 
1.8.3.1



[PATCH] Staging: rtl8192e: fix line length coding style issue in rtllib_softmac.c

2016-03-19 Thread Yousof El-Sayed
This is a patch to the rtllib_softmac.c file that fixes up all instances of
 the 'line over 80 characters' warnings found by the checkpatch.pl tool.

Signed-off-by: Yousof El-Sayed 
---
 drivers/staging/rtl8192e/rtllib_softmac.c | 35 +++
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtllib_softmac.c b/drivers/staging/rtl8192e/rtllib_softmac.c
index cfab715..9ba2230 100644
--- a/drivers/staging/rtl8192e/rtllib_softmac.c
+++ b/drivers/staging/rtl8192e/rtllib_softmac.c
@@ -389,7 +389,8 @@ static void rtllib_send_beacon(struct rtllib_device *ieee)
 
if (ieee->beacon_txing && ieee->ieee_up)
mod_timer(&ieee->beacon_timer, jiffies +
- (msecs_to_jiffies(ieee->current_network.beacon_interval - 5)));
+ (msecs_to_jiffies
+  (ieee->current_network.beacon_interval - 5)));
 }
 
 
@@ -601,7 +602,8 @@ static void rtllib_softmac_scan_wq(void *data)
(ieee->current_network.channel + 1) %
MAX_CHANNEL_NUMBER;
if (ieee->scan_watch_dog++ > MAX_CHANNEL_NUMBER) {
-   if (!ieee->active_channel_map[ieee->current_network.channel])
+   if (!ieee->active_channel_map[ieee->
+   current_network.channel])
ieee->current_network.channel = 6;
goto out; /* no good chans */
}
@@ -1716,8 +1718,9 @@ inline void rtllib_softmac_new_net(struct rtllib_device *ieee,
if (ieee->iw_mode == IW_MODE_INFRA) {
/* Join the network for the first time */
ieee->AsocRetryCount = 0;
-   if ((ieee->current_network.qos_data.supported == 1) &&
-   ieee->current_network.bssht.bdSupportHT)
+   if ((ieee->current_network.qos_data.supported 
+ == 1) && 
+  ieee->current_network.bssht.bdSupportHT)
HTResetSelfAndSavePeerSetting(ieee,
 &(ieee->current_network));
else
@@ -2044,8 +2047,8 @@ static short rtllib_sta_ps_sleep(struct rtllib_device *ieee, u64 *time)
}
 
*time = ieee->current_network.last_dtim_sta_time
-   + msecs_to_jiffies(ieee->current_network.beacon_interval *
-   LPSAwakeIntvl_tmp);
+   + msecs_to_jiffies(ieee->
+   current_network.beacon_interval * LPSAwakeIntvl_tmp);
}
}
 
@@ -2237,11 +2240,15 @@ inline int rtllib_rx_assoc_resp(struct rtllib_device *ieee, struct sk_buff *skb,
ieee->assoc_id = aid;
ieee->softmac_stats.rx_ass_ok++;
/* station support qos */
-   /* Let the register setting default with Legacy station */
-   assoc_resp = (struct rtllib_assoc_response_frame *)skb->data;
+   /* Let the register setting default */
+   /*  with Legacy station */
+   assoc_resp = (struct 
+   rtllib_assoc_response_frame *)skb->data;
if (ieee->current_network.qos_data.supported == 1) {
-   if (rtllib_parse_info_param(ieee, assoc_resp->info_element,
-   rx_stats->len - sizeof(*assoc_resp),
+   if (rtllib_parse_info_param
+   (ieee, assoc_resp->info_element,
+   rx_stats->len - sizeof
+   (*assoc_resp),
network, rx_stats)) {
kfree(network);
return 1;
@@ -2254,8 +2261,9 @@ inline int rtllib_rx_assoc_resp(struct rtllib_device *ieee, struct sk_buff *skb,
   network->bssht.bdHTInfoLen);
if (ieee->handle_assoc_response != NULL)
ieee->handle_assoc_response(ieee->dev,
-(struct rtllib_assoc_response_frame *)header,
-network);
+(struct 
+ rtlli

PROJECT FUNDING/DEBT FINANCING

2015-06-27 Thread Mohammed El-Shaban
Greetings,

We are an Investment company that invites you to partner with us and benefit in
our new Loan and Project funding program. We offer flexible loans and funding
for various projects by passing the usual rigorous procedures. This Funding
program allows a client to enjoy low interest payback for as low as 3 - 4% per
annum for a period of 7-8 years. We can approve a loan/funding for up to USD
500,000,000.00 or more depending on the nature of business. We are currently
funding for:

* Starting up a Franchise
* Business Acquisition
* Business Expansion
* Commercial Real Estate purchase
* Contract Execution

We are open to having a good business relationship with you. If you think you
have a solid background and idea of making good profit in any venture, please
do not hesitate to contact us for possible business co-operation.

Sincerely,
El-Shaban


Re: [PATCH v2 1/5] dt-bindings: remoteproc: sse710: Add the External Systems remote processors

2024-10-09 Thread Abdellatif El Khlifi
Hello folks,

> On 22/08/2024 6:09 pm, Abdellatif El Khlifi wrote:
> > Add devicetree binding schema for the External Systems remote processors
> > 
> > The External Systems remote processors are provided on the Corstone-1000
> > IoT Reference Design Platform via the SSE-710 subsystem.
> > 
> > For more details about the External Systems, please see Corstone SSE-710
> > subsystem features [1].
> > 
> > [1]: https://developer.arm.com/documentation/102360//Overview-of-Corstone-1000/Corstone-SSE-710-subsystem-features
> > 
> > Signed-off-by: Abdellatif El Khlifi 
> > ---
> >   .../remoteproc/arm,sse710-extsys.yaml | 90 +++
> >   1 file changed, 90 insertions(+)
> >   create mode 100644 Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
> > 
> > diff --git a/Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml b/Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
> > new file mode 100644
> > index ..827ba8d962f1
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
> > @@ -0,0 +1,90 @@
> > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/remoteproc/arm,sse710-extsys.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: SSE-710 External System Remote Processor
> 
> Thing is, this is not describing SSE-710. As far as I can work out, it is
> describing the firmware and hardware that a particular example
> implementation of the Corstone-1000 kit has chosen to put in the "external
> system" hole in the SSE-710 within that kit.
> 
> If I license SSE-710 alone or even the Corstone-1000 kit, I can put whatever
> I want in *my* implementation of those subsystems, so there clearly cannot
> possibly be a common binding for that.
> 
> For instance what if I decide to combine a Cortex-M core plus a radio and
> some other glue as my external subsystem? Do we have dozens of remoteproc
> bindings and drivers for weird fixed-function remoteprocs whose
> "firmware-name" implies a Bluetooth protocol stack? No, we treat them as
> Bluetooth controller devices. Look at
> devicetree/bindings/sound/fsl,rpmsg.yaml - it's even unashamedly an rpmsg
> client, but it's still not abusing the remoteproc subsystem because its
> function to the host OS is as an audio controller, not an arbitrarily
> configurable processor.
> 
> As I said before, all SSE-710 actually implements is a reset mechanism, so
> it only seems logical to model it as a reset controller, e.g. something
> like:
> 
>     hbsys: syscon@xyz {
>         compatible = "arm,sse710-host-base-sysctrl", "syscon";
>         reg = ;
>         #reset-cells = <1>;
>     };
> 
>     something {
>         ...
>         resets = <&hbsys 0>;
>     };
> 
>     something-else {
>         ...
>         resets = <&hbsys 1>;
>     };
> 
> 
> Then if there is actually any meaningful functionality in the default
> extsys0 firmware preloaded on the FPGA setup then define a binding for
> "arm,corstone1000-an550-extsys0" to describe whatever that actually does. If
> a user chooses to create and load their own different firmware, they're
> going to need their own binding and driver for whatever *that* firmware
> does.
> 
> FWIW, driver-wise the mapping to the reset API seems straightforward -
> .assert hits RST_REQ, .deassert clears CPUWAIT (.status is possibly a
> combination of CPUWAIT and RST_ACK?)

We are happy to follow what Robin recommended.

This can be summarized in two parts:

Part 1: Writing an SSE-710 reset controller driver

An SSE-710 reset controller driver that switches on/off the external system.
The driver will be helpful for products using SSE-710. So whoever licenses
Corstone-1000 or SSE-710 will find the reset controller driver helpful.
They can use it with their implementation of the external system.

Note: It's likely that the external systems end users put in their products
will differ from the Corstone-1000 external system given as an example, for
instance in the memory configuration, the subsystems involved, or the boot ROM
configuration.
These differences mean that end users will need to write their own driver,
which might or might not be a remoteproc driver (e.g. Bluetooth, audio, ...).
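
The mapping to the reset API suggested in the quoted review (.assert hits
RST_REQ, .deassert clears CPUWAIT, .status reads RST_ACK) can be sketched in
plain C. This is only a user-space model under stated assumptions: the struct
and the extsys_* helpers are hypothetical names, the register pair is simulated
in memory instead of being accessed through regmap, and the bit positions are
the ones from the header posted in this series (CPUWAIT is bit 0, RST_REQ is
bit 1, RST_ACK is bits 2:1 of the status register).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit layout as posted in the series; names shortened for the sketch. */
#define RST_CTRL_CPUWAIT  (1u << 0)                    /* CPU Wait control */
#define RST_CTRL_RST_REQ  (1u << 1)                    /* Reset request */
#define RST_ST_ACK_SHIFT  1
#define RST_ST_ACK_MASK   (0x3u << RST_ST_ACK_SHIFT)   /* GENMASK(2, 1) */

/* Possible values of the reset acknowledge field. */
enum extsys_rst_ack {
    EXTSYS_RST_ACK_NO_RESET_REQ = 0x0,
    EXTSYS_RST_ACK_NOT_COMPLETE = 0x1,
    EXTSYS_RST_ACK_COMPLETE     = 0x2,
    EXTSYS_RST_ACK_RESERVED     = 0x3,
};

/* Hypothetical in-memory stand-in for one external system's two registers. */
struct extsys_regs {
    uint32_t rst_ctrl;
    uint32_t rst_st;
};

/* .assert: request a reset by setting RST_REQ. */
static void extsys_reset_assert(struct extsys_regs *regs)
{
    assert(regs != NULL);
    regs->rst_ctrl |= RST_CTRL_RST_REQ;
}

/* .deassert: let the core run by clearing CPUWAIT. */
static void extsys_reset_deassert(struct extsys_regs *regs)
{
    assert(regs != NULL);
    regs->rst_ctrl &= ~RST_CTRL_CPUWAIT;
}

/* .status: decode the RST_ACK field from the status register. */
static enum extsys_rst_ack extsys_reset_ack(const struct extsys_regs *regs)
{
    assert(regs != NULL);
    return (enum extsys_rst_ack)((regs->rst_st & RST_ST_ACK_MASK)
                                 >> RST_ST_ACK_SHIFT);
}
```

Whether a real driver also needs to clear RST_REQ on deassert, or to poll
RST_ACK before returning, is not settled in this thread; the sketch only
mirrors the mapping proposed in the review.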

Part 2: Corstone-1000 remoteproc driver

Corstone-1000 HW is being upgraded to support m

Re: [PATCH v2 1/5] dt-bindings: remoteproc: sse710: Add the External Systems remote processors

2024-09-23 Thread Abdellatif El Khlifi
Hi Krzysztof,

> > +  '#extsys-id':
> 
>  '#' is not correct for sure, that's not a cell specifier.
> 
>  But anyway, we do not accept in general instance IDs.
> >>>
> >>> I'm happy to replace the instance ID with  another solution.
> >>> In our case the remoteproc instance does not have a base address
> >>> to use. So, we can't put remoteproc@address
> >>>
> >>> What do you recommend in this case please ?
> >>
> >> Waiting one month to respond is a great way to drop all context from my
> >> memory. The emails are not even available for me - gone from inbox.
> >>
> >> Bus addressing could note it. Or you have different devices, so
> >> different compatibles. Tricky to say, because you did not describe the
> >> hardware really and it's one month later...
> >>
> >
> > Sorry for waiting. I was in holidays.
> >
> > I'll add more documentation about the external system for more clarity 
> > [1].
> >
> > Basically, Linux runs on the Cortex-A35. The External system is a
> > Cortex-M core. The Cortex-A35 can not access the memory of the Cortex-M.
> > It can only control Cortex-M core using the reset control and status 
> > registers mapped
> > in the memory space of the Cortex-A35.
> 
>  That's pretty standard.
> 
>  It does not explain me why bus addressing or different compatible are
>  not sufficient here.
> >>>
> >>> Using an instance ID was a design choice.
> >>> I'm happy to replace it with the use of compatible and match data (WIP).
> >>>
> >>> The match data will be pointing to a data structure containing the right 
> >>> offsets
> >>> to be used with regmap APIs.
> >>>
> >>> syscon node is used to represent the Host Base System Control register 
> >>> area [1]
> >>> where the external system reset registers are mapped (EXT_SYS*).
> >>>
> >>> The nodes will look like this:
> >>>
> >>> syscon@1a01 {
> >>> compatible = "arm,sse710-host-base-sysctrl", "simple-mfd", 
> >>> "syscon";
> >>> reg = <0x1a01 0x1000>;
> >>>
> >>> #address-cells = <1>;
> >>> #size-cells = <1>;
> >>>
> >>> remoteproc@310 {
> >>> compatible = "arm,sse710-extsys0";
> >>> reg = <0x310 4>;
> >>
> >> Uh, why do you create device nodes for one word? This really suggests it
> >> is part of parent device and your split is artificial.
> > 
> > The external system registers (described by the remoteproc node) are part
> > of the parent device (the Host Base System Control register area) described
> > by syscon.
> > 
> > In case of the external system 0 , its registers are located at offset 0x310
> > (physical address: 0x1a010310)
> > 
> > When instantiating the devices without @address, the DTC compiler
> > detects 2 nodes with the same name (remoteproc).
> 
> There should be no children at all. DT is not for instantiating your
> drivers. I claim you have only one device and that's
> arm,sse710-host-base-sysctrl. If you create child node for one word,
> that's not a device.

The Host Base System Control [3] is the big block containing various
functionalities (MMIO registers).
Among them are the registers of the two remote cores (aka External System
0 and 1).
Each remote core has two registers.
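
To make the register layout concrete, here is a small C sketch of the address
arithmetic. It is an illustration, not driver code: the helper names are
hypothetical, the offsets (0x310, 0x314, 0x318, 0x31c) come from the header in
this series, and the base address 0x1a010000 is an assumption derived from the
physical address 0x1a010310 quoted above for External System 0 minus its
offset 0x310.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed base, derived as 0x1a010310 - 0x310 (see the quoted text). */
#define HOST_BASE_SYSCTRL_BASE  0x1a010000u

#define EXT_SYS0_RST_CTRL       0x310u  /* Reset Control register */
#define EXT_SYS_REG_STRIDE      0x8u    /* two 32-bit registers per core */
#define EXT_SYS_RST_ST_DELTA    0x4u    /* status follows control */

/* Offset of the Reset Control register of external system `id` (0 or 1). */
static uint32_t extsys_rst_ctrl_offset(unsigned int id)
{
    assert(id < 2);
    return EXT_SYS0_RST_CTRL + id * EXT_SYS_REG_STRIDE;
}

/* Offset of the Reset Status register of external system `id` (0 or 1). */
static uint32_t extsys_rst_st_offset(unsigned int id)
{
    return extsys_rst_ctrl_offset(id) + EXT_SYS_RST_ST_DELTA;
}

/* Physical address of a register at `offset` inside the block. */
static uint32_t host_base_sysctrl_phys(uint32_t offset)
{
    return HOST_BASE_SYSCTRL_BASE + offset;
}
```

With these definitions, the four registers form one contiguous 16-byte window
starting at offset 0x310, which is the grouping the v1 feedback argues for.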

1/ In the v1 patchset, a valid point was made by the community:

   Right now it seems somewhat tenuous to describe two consecutive
   32-bit registers as separate "reg" entries, but *maybe* it's OK if that's
   all there ever is. However if it's actually going to end up needing several
   more additional MMIO and/or memory regions for other functionality, then
   describing each register and location individually is liable to get
   unmanageable really fast, and a higher-level functional grouping (e.g. these
   reset-related registers together as a single 8-byte region) would likely be
   a better design.

   The External system registers are part of a bigger block with other
   functionality in place.
   MFD/syscon might be better way to use these registers. You never know in
   future you might want to use another set of 2-4 registers with a different
   functionality in another driver.

   I would see if it makes sense to put together a single binding for
   this "Host Base System Control" register (not sure what exactly that means).
   Use MFD/regmap you access parts of this block. The remoteproc driver can
   then be semi-generic (meaning applicable to group of similar platforms)
   based on the platform compatible and use this regmap to provide the
   functionality needed.

2/ There are many examples in the kernel that use syscon as a parent node of
   child nodes for devices located at an offset from the syscon base address.
   Please see these two examples [1][2]. I'm trying to follow a similar
   design if that makes sense.

3/ Since there are two registers for each remote core. I'm suggesting to set the
   

Re: [PATCH v2 1/5] dt-bindings: remoteproc: sse710: Add the External Systems remote processors

2024-09-23 Thread Abdellatif El Khlifi
Hi Krzysztof,

> >>> +  '#extsys-id':
> >>
> >> '#' is not correct for sure, that's not a cell specifier.
> >>
> >> But anyway, we do not accept in general instance IDs.
> >
> > I'm happy to replace the instance ID with  another solution.
> > In our case the remoteproc instance does not have a base address
> > to use. So, we can't put remoteproc@address
> >
> > What do you recommend in this case please ?
> 
>  Waiting one month to respond is a great way to drop all context from 
>  my
>  memory. The emails are not even available for me - gone from inbox.
> 
>  Bus addressing could note it. Or you have different devices, so
>  different compatibles. Tricky to say, because you did not describe 
>  the
>  hardware really and it's one month later...
> 
> >>>
> >>> Sorry for waiting. I was in holidays.
> >>>
> >>> I'll add more documentation about the external system for more 
> >>> clarity [1].
> >>>
> >>> Basically, Linux runs on the Cortex-A35. The External system is a
> >>> Cortex-M core. The Cortex-A35 can not access the memory of the 
> >>> Cortex-M.
> >>> It can only control Cortex-M core using the reset control and status 
> >>> registers mapped
> >>> in the memory space of the Cortex-A35.
> >>
> >> That's pretty standard.
> >>
> >> It does not explain me why bus addressing or different compatible are
> >> not sufficient here.
> >
> > Using an instance ID was a design choice.
> > I'm happy to replace it with the use of compatible and match data (WIP).
> >
> > The match data will be pointing to a data structure containing the 
> > right offsets
> > to be used with regmap APIs.
> >
> > syscon node is used to represent the Host Base System Control register 
> > area [1]
> > where the external system reset registers are mapped (EXT_SYS*).
> >
> > The nodes will look like this:
> >
> > syscon@1a01 {
> > compatible = "arm,sse710-host-base-sysctrl", "simple-mfd", 
> > "syscon";
> > reg = <0x1a01 0x1000>;
> >
> > #address-cells = <1>;
> > #size-cells = <1>;
> >
> > remoteproc@310 {
> > compatible = "arm,sse710-extsys0";
> > reg = <0x310 4>;
> 
>  Uh, why do you create device nodes for one word? This really suggests it
>  is part of parent device and your split is artificial.
> >>>
> >>> The external system registers (described by the remoteproc node) are part
> >>> of the parent device (the Host Base System Control register area) 
> >>> described
> >>> by syscon.
> >>>
> >>> In case of the external system 0 , its registers are located at offset 
> >>> 0x310
> >>> (physical address: 0x1a010310)
> >>>
> >>> When instantiating the devices without @address, the DTC compiler
> >>> detects 2 nodes with the same name (remoteproc).
> >>
> >> There should be no children at all. DT is not for instantiating your
> >> drivers. I claim you have only one device and that's
> >> arm,sse710-host-base-sysctrl. If you create child node for one word,
> >> that's not a device.
> > 
> > The Host Base System Control [3] is the big block containing various 
> > functionalities (MMIO registers).
> > Among the functionalities, the two remote cores registers (aka External 
> > system 0 and 1).
> > The remote cores have two registers each.
> > 
> > 1/ In the v1 patchset, a valid point was made by the community:
> > 
> >Right now it seems somewhat tenuous to describe two consecutive
> >32-bit registers as separate "reg" entries, but *maybe* it's OK if that's
> 
> ARM is not special, neither this hardware is. Therefore:
> 1. Each register as reg: nope, for obvious reasons.
> 2. One device for entire syscon: quite common, why do you think it is
> somehow odd?
> 3. If you quote other person, please provide the lore link, so I won't
> spend useless 5 minutes to find who said that or if it was even said...

Please have a look at this lore link [1]. The idea is to add syscon
and regmap support which I did in the v2 patchset.

[1]: https://lore.kernel.org/all/ZfMVcQsmgQUXXcef@bogus/

> 
> >all there ever is. However if it's actually going to end up needing 
> > several
> >more additional MMIO and/or memory regions for other functionality, then
> >describing each register and location individually is liable to get
> >unmanageable really fast, and a higher-level functional grouping (e.g. 
> > these
> >reset-related registers together as a single 8-byte region) would likely 
> > be
> >a better design.
> > 
> >The Exernal system registers are part of a bigger block with other 
> > functionality in place.
> >MFD/syscon might be better way to use these registers. You never know in
> >future you mi

Re: [PATCH v2 5/5] remoteproc: arm64: corstone1000: Add the External Systems driver

2024-09-18 Thread Abdellatif El Khlifi
Hi Mathieu,

> Introduce remoteproc support for Corstone-1000 external systems
> 
> The Corstone-1000 IoT Reference Design Platform supports up to two
> external systems processors. These processors can be switched on or off
> using their reset registers.
> 
> For more details, please see the SSE-710 External System Remote
> Processor binding [1] and the SSE-710 Host Base System Control binding [2].
> 
> The reset registers are MMIO mapped registers accessed using regmap.
> 
> [1]: Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
> [2]: Documentation/devicetree/bindings/arm/arm,sse710-host-base-sysctrl.yaml
> 
> Signed-off-by: Abdellatif El Khlifi 
> ---
>  drivers/remoteproc/Kconfig  |  14 +
>  drivers/remoteproc/Makefile |   1 +
>  drivers/remoteproc/corstone1000_rproc.c | 350 
>  3 files changed, 365 insertions(+)

A gentle reminder about reviewing the driver please.

I'll be addressing the comments made for the bindings.

Thank you in advance.

Cheers,
Abdellatif



Re: [PATCH v2 1/5] dt-bindings: remoteproc: sse710: Add the External Systems remote processors

2024-09-20 Thread Abdellatif El Khlifi
Hi Krzysztof,

> > +  '#extsys-id':
> 
>  '#' is not correct for sure, that's not a cell specifier.
> 
>  But anyway, we do not accept in general instance IDs.
> >>>
> >>> I'm happy to replace the instance ID with  another solution.
> >>> In our case the remoteproc instance does not have a base address
> >>> to use. So, we can't put remoteproc@address
> >>>
> >>> What do you recommend in this case please ?
> >>
> >> Waiting one month to respond is a great way to drop all context from my
> >> memory. The emails are not even available for me - gone from inbox.
> >>
> >> Bus addressing could note it. Or you have different devices, so
> >> different compatibles. Tricky to say, because you did not describe the
> >> hardware really and it's one month later...
> >>
> > 
> > Sorry for waiting. I was in holidays.
> > 
> > I'll add more documentation about the external system for more clarity [1].
> > 
> > Basically, Linux runs on the Cortex-A35. The External system is a
> > Cortex-M core. The Cortex-A35 can not access the memory of the Cortex-M.
> > It can only control Cortex-M core using the reset control and status 
> > registers mapped
> > in the memory space of the Cortex-A35.
> 
> That's pretty standard.
> 
> It does not explain me why bus addressing or different compatible are
> not sufficient here.

Using an instance ID was a design choice.
I'm happy to replace it with the use of compatible and match data (WIP).

The match data will be pointing to a data structure containing the right offsets
to be used with regmap APIs.

syscon node is used to represent the Host Base System Control register area [1]
where the external system reset registers are mapped (EXT_SYS*).

The nodes will look like this:

syscon@1a01 {
    compatible = "arm,sse710-host-base-sysctrl", "simple-mfd", "syscon";
    reg = <0x1a01 0x1000>;

    #address-cells = <1>;
    #size-cells = <1>;

    remoteproc@310 {
        compatible = "arm,sse710-extsys0";
        reg = <0x310 4>;
        ...
    };

    remoteproc@318 {
        compatible = "arm,sse710-extsys1";
        reg = <0x318 4>;
        ...
    };
};
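
To illustrate the match-data idea, here is a minimal user-space sketch, not the driver code. It maps each compatible string to the register offsets that would be passed to the regmap APIs. The control offsets 0x310 and 0x318 are the ones from the nodes in this mail; the status offsets (control + 4), the structure names and the lookup helper are assumptions made for the example. In the driver, the table would presumably live in the of_device_id array and be retrieved with device_get_match_data().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Per-instance register offsets inside the syscon region (illustrative). */
struct extsys_match_data {
    unsigned int reset_ctrl_offset;   /* 0x310 / 0x318, taken from the nodes */
    unsigned int reset_status_offset; /* assumption: control offset + 4 */
};

/* Hypothetical pairing of a compatible string with its match data. */
struct extsys_match {
    const char *compatible;
    struct extsys_match_data data;
};

static const struct extsys_match extsys_matches[] = {
    { "arm,sse710-extsys0", { 0x310, 0x314 } },
    { "arm,sse710-extsys1", { 0x318, 0x31c } },
};

/* Return the match data for a compatible string, or NULL if it is unknown. */
const struct extsys_match_data *extsys_get_match_data(const char *compatible)
{
    size_t i;

    assert(compatible != NULL);
    for (i = 0; i < sizeof(extsys_matches) / sizeof(extsys_matches[0]); i++) {
        if (strcmp(extsys_matches[i].compatible, compatible) == 0)
            return &extsys_matches[i].data;
    }
    return NULL;
}
```

With this layout the driver code stays identical for both instances; only the offsets selected by the compatible differ.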


[1]: 
https://developer.arm.com/documentation/102342//Programmers-model/Register-descriptions/Host-Base-System-Control-register-summary

Cheers
Abdellatif



Re: [PATCH v2 1/5] dt-bindings: remoteproc: sse710: Add the External Systems remote processors

2024-09-19 Thread Abdellatif El Khlifi
Hi Krzysztof,

> > Add devicetree binding schema for the External Systems remote processors
> > 
> > The External Systems remote processors are provided on the Corstone-1000
> > IoT Reference Design Platform via the SSE-710 subsystem.
> > 
> > For more details about the External Systems, please see Corstone SSE-710
> > subsystem features [1].
> > 
> 
> Do not attach (thread) your patchsets to some other threads (unrelated
> or older versions). This buries them deep in the mailbox and might
> interfere with applying entire sets.
> 
> > [1]: 
> > https://developer.arm.com/documentation/102360//Overview-of-Corstone-1000/Corstone-SSE-710-subsystem-features
> > 
> > Signed-off-by: Abdellatif El Khlifi 
> > ---
> >  .../remoteproc/arm,sse710-extsys.yaml | 90 +++
> >  1 file changed, 90 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
> > 
> > diff --git 
> > a/Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml 
> > b/Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
> > new file mode 100644
> > index ..827ba8d962f1
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/remoteproc/arm,sse710-extsys.yaml
> > @@ -0,0 +1,90 @@
> > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/remoteproc/arm,sse710-extsys.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: SSE-710 External System Remote Processor
> > +
> > +maintainers:
> > +  - Abdellatif El Khlifi 
> > +  - Hugues Kamba Mpiana 
> > +
> > +description: |
> 
> dt-preserve-formatting

Do you mean I should remove the '|'? (I could not find any use of
dt-preserve-formatting in Documentation/devicetree/bindings/.)

> 
> > +  SSE-710 is an heterogeneous subsystem supporting up to two remote
> > +  processors aka the External Systems.
> > +
> > +properties:
> > +  compatible:
> > +enum:
> > +  - arm,sse710-extsys
> > +
> > +  firmware-name:
> > +description:
> > +  The default name of the firmware to load to the remote processor.
> > +
> > +  '#extsys-id':
> 
> '#' is not correct for sure, that's not a cell specifier.
> 
> But anyway, we do not accept in general instance IDs.

I'm happy to replace the instance ID with another solution.
In our case the remoteproc instance does not have a base address
to use, so we can't use remoteproc@address.

What do you recommend in this case, please?

Kind regards,
Abdellatif



Re: [PATCH v2 1/5] dt-bindings: remoteproc: sse710: Add the External Systems remote processors

2024-09-20 Thread Abdellatif El Khlifi
Hi Krzysztof,

> >>> +  '#extsys-id':
> >>
> >> '#' is not correct for sure, that's not a cell specifier.
> >>
> >> But anyway, we do not accept in general instance IDs.
> >
> > I'm happy to replace the instance ID with  another solution.
> > In our case the remoteproc instance does not have a base address
> > to use. So, we can't put remoteproc@address
> >
> > What do you recommend in this case please ?
> 
>  Waiting one month to respond is a great way to drop all context from my
>  memory. The emails are not even available for me - gone from inbox.
> 
>  Bus addressing could note it. Or you have different devices, so
>  different compatibles. Tricky to say, because you did not describe the
>  hardware really and it's one month later...
> 
> >>>
> >>> Sorry for waiting. I was in holidays.
> >>>
> >>> I'll add more documentation about the external system for more clarity 
> >>> [1].
> >>>
> >>> Basically, Linux runs on the Cortex-A35. The External system is a
> >>> Cortex-M core. The Cortex-A35 can not access the memory of the Cortex-M.
> >>> It can only control Cortex-M core using the reset control and status 
> >>> registers mapped
> >>> in the memory space of the Cortex-A35.
> >>
> >> That's pretty standard.
> >>
> >> It does not explain me why bus addressing or different compatible are
> >> not sufficient here.
> > 
> > Using an instance ID was a design choice.
> > I'm happy to replace it with the use of compatible and match data (WIP).
> > 
> > The match data will be pointing to a data structure containing the right 
> > offsets
> > to be used with regmap APIs.
> > 
> > syscon node is used to represent the Host Base System Control register area 
> > [1]
> > where the external system reset registers are mapped (EXT_SYS*).
> > 
> > The nodes will look like this:
> > 
> > syscon@1a01 {
> > compatible = "arm,sse710-host-base-sysctrl", "simple-mfd", "syscon";
> > reg = <0x1a01 0x1000>;
> > 
> > #address-cells = <1>;
> > #size-cells = <1>;
> > 
> > remoteproc@310 {
> > compatible = "arm,sse710-extsys0";
> > reg = <0x310 4>;
> 
> Uh, why do you create device nodes for one word? This really suggests it
> is part of parent device and your split is artificial.

The external system registers (described by the remoteproc node) are part
of the parent device (the Host Base System Control register area) described
by syscon.

In the case of external system 0, its registers are located at offset 0x310
(physical address: 0x1a010310).

When the devices are instantiated without @address, DTC detects two nodes
with the same name (remoteproc):

syscon@1a01 {
    ...

    remoteproc {
        compatible = "arm,sse710-extsys0";
        ...
    };

    remoteproc {
        compatible = "arm,sse710-extsys1";
        ...
    };
};

Cheers
Abdellatif



Re: [PATCH v2 1/5] dt-bindings: remoteproc: sse710: Add the External Systems remote processors

2024-09-19 Thread Abdellatif El Khlifi
Hi Krzysztof,

> >>> +  '#extsys-id':
> >>
> >> '#' is not correct for sure, that's not a cell specifier.
> >>
> >> But anyway, we do not accept in general instance IDs.
> > 
> > I'm happy to replace the instance ID with  another solution.
> > In our case the remoteproc instance does not have a base address
> > to use. So, we can't put remoteproc@address
> > 
> > What do you recommend in this case please ?
> 
> Waiting one month to respond is a great way to drop all context from my
> memory. The emails are not even available for me - gone from inbox.
> 
> Bus addressing could note it. Or you have different devices, so
> different compatibles. Tricky to say, because you did not describe the
> hardware really and it's one month later...
> 

Sorry for the delay; I was on holiday.

I'll add more documentation about the external system for more clarity [1].

Basically, Linux runs on the Cortex-A35. The External System is a
Cortex-M core. The Cortex-A35 cannot access the memory of the Cortex-M.
It can only control the Cortex-M core using the reset control and status
registers mapped in the memory space of the Cortex-A35.

I'll make sure this explanation is added to the binding and commit log for
more clarity.

Thanks for the suggestion regarding supporting multiple instances of the
External system. I will send a new version shortly addressing all comments.

[1]: paragraph 2.3, https://developer.arm.com/documentation/dai0550/D/?lang=en

Kind regards
Abdellatif



Detecting I/O error and Halting System : come back

2006-12-07 Thread zine el abidine Hamid
Hi everybody,

I am coming back to my "I/O error" problem (see the
following link to refresh your memory:
http://groups.google.fr/group/linux.kernel/browse_thread/thread/386b69ca8389cda0/a58d753bf87c4f06?lnk=st&q=hamid+ZINE+EL+ABIDINE&rnum=2&hl=fr#a58d753bf87c4f06
)


I have one last question, and I promise I will
stop bothering you with this problem afterwards.
Can you explain why some parts of the hard drive
seem to be read-only while others seem to have
disappeared completely?
Some commands are available and work fine; others
generate an error like this:

-bash: /bin/some_command: Input/output error

Can you explain that?
Can I assume that the commands that still work
are in fact served from memory/cache and not read from the
hard drive?

Running "e2fsck" sometimes ends early with a
"Bus error".
When I run "e2fsck" again I get these errors:
Error reading block  ( Attempt to read block from
filesystem resulted in short read). Ignore error?
...
...
Inode XYZW (...) has bad mode (00).
Clear?
...



When I try to write to a file (one that I can read and
that is not empty), I lose it (size = 0, and the
content seems to be gone).

Why do I get a bus error when I run the shutdown
command?




Some Linux users pointed to a journal commit problem;
what do you think about that?
(see the link:
http://groups.google.fr/group/comp.os.linux.misc/browse_thread/thread/7b51af5a197d4ff3/facf167c3354a06f?lnk=st&q=bash+I%2FO+error&rnum=7&hl=fr#facf167c3354a06f
)


I googled and found that there are a lot of similar
cases:
http://groups.google.fr/group/linux.redhat/browse_frm/thread/35cbedb6667755ed/64dc7232177c7dc8?lnk=st&q=bash%3A+Input%2Foutput+error&rnum=25&hl=fr#64dc7232177c7dc8
http://groups.google.fr/group/fa.linux.kernel/browse_frm/thread/87546535d17c674b/030caa62ad099af9?lnk=st&q=bash%3A+%2Fusr%2Fbin%2F%3A+Input%2Foutput+error&rnum=6&hl=fr#030caa62ad099af9
http://groups.google.fr/group/fa.linux.kernel/browse_frm/thread/60d92e4bb6a5f4db/18e69ef0d46448ac?lnk=st&q=bash%3A+%2Fusr%2Fbin%2F%3A+Input%2Foutput+error&rnum=10&hl=fr#18e69ef0d46448ac
http://groups.google.fr/group/comp.os.linux.misc/browse_frm/thread/a006864077740438/f08d570d516657c1?lnk=st&q=bash%3A+%2Fusr%2Fbin%2F%3A+Input%2Foutput+error&rnum=5&hl=fr#f08d570d516657c1
http://groups.google.fr/group/alt.e-smith.fr/browse_frm/thread/44176232ffc3c1a6/5b35e4771b8030a7?lnk=st&q=bash%3A+%2Fusr%2Fbin%2F%3A+Input%2Foutput+error&rnum=63&hl=fr#5b35e4771b8030a7
http://groups.google.fr/group/comp.os.linux.misc/browse_frm/thread/137246f4bf738e2/b26503666b86972d?lnk=st&q=bash%3A+Input%2Foutput+error&rnum=3&hl=fr#b26503666b86972d
http://groups.google.fr/group/comp.os.linux.misc/browse_frm/thread/a006864077740438/f08d570d516657c1?lnk=st&q=bash%3A+Input%2Foutput+error&rnum=13&hl=fr#f08d570d516657c1
http://groups.google.fr/group/linux.gentoo.user/browse_frm/thread/678951cec2655493/54ed838422331550?lnk=st&q=bash%3A+Input%2Foutput+error&rnum=35&hl=fr#54ed838422331550









[PATCH] SelfTest: KVM: Drop Asserts for madvise failures

2018-11-15 Thread Ahmed Abd El Mawgood
From: Ahmed Abd El Mawgood 

madvise() returns -1 without CONFIG_TRANSPARENT_HUGEPAGE=y. That
triggers the asserts that check the return value of madvise(). Following
a decision similar to [1], I think it is OK to assume that a madvise()
failure implies that THP is not supported by the host kernel.

Another option would be to check for Transparent Huge Page support in
/sys/kernel/mm/transparent_hugepage/enabled.
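
For reference, the sysfs alternative could be sketched roughly as below. This is not part of the patch. It assumes the usual format of that file, where the active mode is the bracketed token (for example "always [madvise] never"), and it takes the file contents as a string so that reading the file (and treating a missing file as "no THP") is left to the caller. The function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (active) token of a string such as
 * "always [madvise] never" into out. Returns 0 on success, -1 otherwise.
 */
int thp_active_mode(const char *contents, char *out, size_t out_len)
{
    const char *open, *close;
    size_t len;

    assert(contents != NULL && out != NULL);
    open = strchr(contents, '[');
    if (!open)
        return -1;
    close = strchr(open, ']');
    if (!close)
        return -1;
    len = (size_t)(close - open - 1);
    if (len + 1 > out_len)
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Treat THP as usable unless no mode is marked or the mode is "never". */
int thp_usable(const char *contents)
{
    char mode[16];

    if (thp_active_mode(contents, mode, sizeof(mode)) != 0)
        return 0;
    return strcmp(mode, "never") != 0;
}
```

Ignoring the madvise() result, as this patch does, avoids the extra file access and parsing.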

-- links --
[1] https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg04514.html

Signed-off-by: Ahmed Abd El Mawgood 
---
 tools/testing/selftests/kvm/lib/kvm_util.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c 
b/tools/testing/selftests/kvm/lib/kvm_util.c
index 1b41e71283d5..7725cfdf1b79 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -586,14 +586,12 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 src_type == VM_MEM_SRC_ANONYMOUS_THP ?  
huge_page_size : 1);
 
/* As needed perform madvise */
-   if (src_type == VM_MEM_SRC_ANONYMOUS || src_type == 
VM_MEM_SRC_ANONYMOUS_THP) {
-   ret = madvise(region->host_mem, npages * vm->page_size,
-src_type == VM_MEM_SRC_ANONYMOUS ? MADV_NOHUGEPAGE 
: MADV_HUGEPAGE);
-   TEST_ASSERT(ret == 0, "madvise failed,\n"
-   "  addr: %p\n"
-   "  length: 0x%lx\n"
-   "  src_type: %x",
-   region->host_mem, npages * vm->page_size, src_type);
+   if (src_type == VM_MEM_SRC_ANONYMOUS) {
+   madvise(region->host_mem, npages * vm->page_size,
+   MADV_NOHUGEPAGE);
+   } else if (src_type == VM_MEM_SRC_ANONYMOUS_THP) {
+   madvise(region->host_mem, npages * vm->page_size,
+   MADV_HUGEPAGE);
}
 
region->unused_phy_pages = sparsebit_alloc();
-- 
2.18.1



[PATCH V2] SelfTest: KVM: Drop Asserts for madvise MADV_NOHUGEPAGE failure

2018-11-16 Thread Ahmed Abd El Mawgood
From: Ahmed Abd El Mawgood 

madvise() returns -1 without CONFIG_TRANSPARENT_HUGEPAGE=y. That
triggers the asserts that check the return value of madvise(). Following
a decision similar to [1], I think it is OK to assume that a madvise()
MADV_NOHUGEPAGE failure implies that THP is not supported by the host kernel.
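
Outside the selftest framework, the intended behaviour can be sketched as follows. This is a simplified stand-alone version assuming Linux and glibc; the enum and function names are hypothetical stand-ins for the selftest's source types, and the error is returned instead of being asserted on.

```c
#define _DEFAULT_SOURCE /* exposes MADV_HUGEPAGE / MADV_NOHUGEPAGE on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-ins for VM_MEM_SRC_ANONYMOUS and VM_MEM_SRC_ANONYMOUS_THP. */
enum mem_src { SRC_ANONYMOUS, SRC_ANONYMOUS_THP };

/*
 * Returns 0 when the region is usable for the requested source type,
 * -1 when THP was explicitly requested but MADV_HUGEPAGE failed.
 */
int advise_region(void *addr, size_t len, enum mem_src src)
{
    assert(src == SRC_ANONYMOUS || src == SRC_ANONYMOUS_THP);

    if (src == SRC_ANONYMOUS) {
        /*
         * A failure here most likely means the kernel has no THP support,
         * in which case there are no huge pages to opt out of.
         */
        (void)madvise(addr, len, MADV_NOHUGEPAGE);
        return 0;
    }

    /* THP was requested explicitly, so a failure must be reported. */
    return madvise(addr, len, MADV_HUGEPAGE) == 0 ? 0 : -1;
}
```

Only the MADV_NOHUGEPAGE result is ignored; a caller that asked for THP-backed memory still sees the failure.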

Another option would be to check for Transparent Huge Page support in
/sys/kernel/mm/transparent_hugepage/enabled.

-- links --
[1] https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg04514.html

Signed-off-by: Ahmed Abd El Mawgood 
---
 tools/testing/selftests/kvm/lib/kvm_util.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c 
b/tools/testing/selftests/kvm/lib/kvm_util.c
index 1b41e71283d5..437c5bb48061 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -586,14 +586,23 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 src_type == VM_MEM_SRC_ANONYMOUS_THP ?  
huge_page_size : 1);
 
/* As needed perform madvise */
-   if (src_type == VM_MEM_SRC_ANONYMOUS || src_type == 
VM_MEM_SRC_ANONYMOUS_THP) {
+   if (src_type == VM_MEM_SRC_ANONYMOUS) {
+   /*
+* Neglect madvise error because it is ok to not have THP
+* support in this case.
+*/
+   madvise(region->host_mem, npages * vm->page_size,
+   MADV_NOHUGEPAGE);
+   } else if (src_type == VM_MEM_SRC_ANONYMOUS_THP) {
ret = madvise(region->host_mem, npages * vm->page_size,
-src_type == VM_MEM_SRC_ANONYMOUS ? MADV_NOHUGEPAGE 
: MADV_HUGEPAGE);
+   MADV_HUGEPAGE);
TEST_ASSERT(ret == 0, "madvise failed,\n"
-   "  addr: %p\n"
-   "  length: 0x%lx\n"
-   "  src_type: %x",
-   region->host_mem, npages * vm->page_size, src_type);
+   "Does the kernel have CONFIG_TRANSPARENT_HUGEPAGE=y\n"
+   "  addr: %p\n"
+   "  length: 0x%lx\n"
+   "  src_type: %x\n",
+   region->host_mem, npages * vm->page_size,
+   src_type);
}
 
region->unused_phy_pages = sparsebit_alloc();
-- 
2.18.1



[PATCH 02/10] KVM: X86: Add arbitrary data pointer in kvm memslot iterator functions

2018-12-07 Thread Ahmed Abd El Mawgood
This helps share data with the slot_level_handler callback. In my
case I need to share a counter of the pages traversed, for use with a
bitmap. Being able to pass an arbitrary pointer to the
slot_level_handler callback makes that easy.
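
The pattern is the usual C "callback plus opaque pointer" idiom. A minimal user-space sketch, with simplified types and made-up names rather than the KVM ones, might look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Handler type with an opaque data pointer, modelled on slot_level_handler. */
typedef bool (*level_handler)(unsigned long *entry, void *data);

/* Walk all entries; returns true if any handler call asked for a flush. */
bool handle_range(unsigned long *entries, size_t n, level_handler fn,
                  void *data)
{
    bool flush = false;
    size_t i;

    assert(fn != NULL);
    for (i = 0; i < n; i++)
        flush |= fn(&entries[i], data);
    return flush;
}

/* Example handler: clear the low bit and count visited entries via data. */
bool clear_and_count(unsigned long *entry, void *data)
{
    size_t *visited = data;
    bool changed = (*entry & 1UL) != 0;

    *entry &= ~1UL;
    if (visited)
        (*visited)++;
    return changed;
}
```

Handlers that do not need extra state simply receive NULL, which is what the existing callers in this patch pass.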

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/mmu.c | 65 ++
 1 file changed, 37 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7c03c0f354..b67d743c33 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1492,7 +1492,7 @@ static bool spte_write_protect(u64 *sptep, bool 
pt_protect)
 
 static bool __rmap_write_protect(struct kvm *kvm,
 struct kvm_rmap_head *rmap_head,
-bool pt_protect)
+bool pt_protect, void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1531,7 +1531,8 @@ static bool wrprot_ad_disabled_spte(u64 *sptep)
  * - W bit on ad-disabled SPTEs.
  * Returns true iff any D or W bits were cleared.
  */
-static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head 
*rmap_head)
+static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head 
*rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1557,7 +1558,8 @@ static bool spte_set_dirty(u64 *sptep)
return mmu_spte_update(sptep, spte);
 }
 
-static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
+static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1589,7 +1591,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm 
*kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_write_protect(kvm, rmap_head, false);
+   __rmap_write_protect(kvm, rmap_head, false, NULL);
 
/* clear the first set bit */
mask &= mask - 1;
@@ -1615,7 +1617,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_clear_dirty(kvm, rmap_head);
+   __rmap_clear_dirty(kvm, rmap_head, NULL);
 
/* clear the first set bit */
mask &= mask - 1;
@@ -1668,7 +1670,8 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 
for (i = PT_PAGE_TABLE_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
rmap_head = __gfn_to_rmap(gfn, i, slot);
-   write_protected |= __rmap_write_protect(kvm, rmap_head, true);
+   write_protected |= __rmap_write_protect(kvm, rmap_head, true,
+   NULL);
}
 
return write_protected;
@@ -1682,7 +1685,8 @@ static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 
gfn)
return kvm_mmu_slot_gfn_write_protect(vcpu->kvm, slot, gfn);
 }
 
-static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
+static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1702,7 +1706,7 @@ static int kvm_unmap_rmapp(struct kvm *kvm, struct 
kvm_rmap_head *rmap_head,
   struct kvm_memory_slot *slot, gfn_t gfn, int level,
   unsigned long data)
 {
-   return kvm_zap_rmapp(kvm, rmap_head);
+   return kvm_zap_rmapp(kvm, rmap_head, NULL);
 }
 
 static int kvm_set_pte_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
@@ -5514,13 +5518,15 @@ void kvm_mmu_uninit_vm(struct kvm *kvm)
 }
 
 /* The return value indicates if tlb flush on all vcpus is needed. */
-typedef bool (*slot_level_handler) (struct kvm *kvm, struct kvm_rmap_head 
*rmap_head);
+typedef bool (*slot_level_handler) (struct kvm *kvm,
+   struct kvm_rmap_head *rmap_head, void *data);
 
 /* The caller should hold mmu-lock before calling this function. */
 static __always_inline bool
 slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot,
slot_level_handler fn, int start_level, int end_level,
-   gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb)
+   gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb,
+   void *data)
 {
struct slot_rmap_walk_iterator iterator;
bool flush = false;
@@ -5528,7 +5534,7 @@ slot_handle_level_range(struct kvm *kvm, struct 
kvm_memory_slot *memslot,
for_each_slot_rmap_range(memslot, start_level, end_level, start_gfn,
  

[PATCH 01/10] KVM: State whether memory should be freed in kvm_free_memslot

2018-12-07 Thread Ahmed Abd El Mawgood
The conditions under which kvm_free_memslot frees memory are somewhat ad hoc,
which makes it hard to extend the memslot with allocatable data that needs to be
freed, so I replaced the current mechanism with an explicit flag that states
whether the memory slot should be freed.
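
As a rough illustration of the idea, with a simplified structure and hypothetical names rather than the KVM definitions, the decision to free is driven by the kind of change instead of by comparing pointers between the old and new slot:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kind of change applied to a memory slot. */
enum mr_change { MR_CREATE, MR_DELETE, MR_MOVE, MR_FLAGS_ONLY };

/* Simplified slot that owns one heap-allocated bitmap. */
struct mem_slot {
    unsigned long *dirty_bitmap;
    unsigned long npages;
};

/* Free slot-owned allocations only when the slot is really being deleted. */
void free_mem_slot(struct mem_slot *slot, enum mr_change change)
{
    assert(slot != NULL);
    if (change == MR_DELETE) {
        free(slot->dirty_bitmap);
        slot->dirty_bitmap = NULL;
    }
    slot->npages = 0;
}
```

Any allocation added to the slot later only needs one more free() inside the MR_DELETE branch.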

Signed-off-by: Ahmed Abd El Mawgood 
---
 virt/kvm/kvm_main.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2679e476b6..039c1ef9a7 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -549,9 +549,10 @@ static void kvm_destroy_dirty_bitmap(struct 
kvm_memory_slot *memslot)
  * Free any memory in @free but not in @dont.
  */
 static void kvm_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free,
- struct kvm_memory_slot *dont)
+ struct kvm_memory_slot *dont,
+ enum kvm_mr_change change)
 {
-   if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
+   if (change == KVM_MR_DELETE)
kvm_destroy_dirty_bitmap(free);
 
kvm_arch_free_memslot(kvm, free, dont);
@@ -567,7 +568,7 @@ static void kvm_free_memslots(struct kvm *kvm, struct 
kvm_memslots *slots)
return;
 
kvm_for_each_memslot(memslot, slots)
-   kvm_free_memslot(kvm, memslot, NULL);
+   kvm_free_memslot(kvm, memslot, NULL, KVM_MR_DELETE);
 
kvfree(slots);
 }
@@ -1063,14 +1064,14 @@ int __kvm_set_memory_region(struct kvm *kvm,
 
kvm_arch_commit_memory_region(kvm, mem, &old, &new, change);
 
-   kvm_free_memslot(kvm, &old, &new);
+   kvm_free_memslot(kvm, &old, &new, change);
kvfree(old_memslots);
return 0;
 
 out_slots:
kvfree(slots);
 out_free:
-   kvm_free_memslot(kvm, &new, &old);
+   kvm_free_memslot(kvm, &new, &old, change);
 out:
return r;
 }
-- 
2.19.2



[PATCH 05/10] KVM: Create architecture independent ROE skeleton

2018-12-07 Thread Ahmed Abd El Mawgood
This patch introduces a hypercall that can help defend against a subset of
kernel rootkits. It works by placing read-only protection in the shadow PTEs.
The resulting protection is also kept in a bitmap for each kvm_memory_slot and
is used as a reference when updating SPTEs. The goal is to protect the
guest kernel's static data from modification even if an attacker is running in
guest ring 0; for this reason there is no hypercall to revert the effect of the
Memory ROE hypercall. This patch doesn't implement an integrity check on the
guest TLB, so an obvious attack on the current implementation would involve
remapping guest virtual addresses to guest physical addresses, but there are
plans to fix that. For this patch to work on a given arch/, one needs to
implement two architecture-specific functions:
kvm_roe_arch_commit_protection() and kvm_roe_arch_is_userspace(). The arch also
needs to invoke kvm_roe using the appropriate hypercall
mechanism.
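
The per-slot bitmap can be pictured with the following user-space sketch. It is an illustration under simplifying assumptions (one bit per page, no SPTE handling, made-up structure and function names) and not the code in virt/kvm/roe.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical, simplified stand-in for a memory slot with a ROE bitmap. */
struct roe_slot {
    unsigned long base_gfn;    /* first guest frame number in the slot */
    unsigned long npages;      /* number of pages covered by the slot */
    unsigned long *roe_bitmap; /* one bit per page; set means read-only */
};

/* Allocate an all-zero bitmap (nothing protected). Returns 0 or -1. */
int roe_slot_init(struct roe_slot *slot, unsigned long base_gfn,
                  unsigned long npages)
{
    size_t words = (npages + BITS_PER_WORD - 1) / BITS_PER_WORD;

    assert(slot != NULL);
    slot->base_gfn = base_gfn;
    slot->npages = npages;
    slot->roe_bitmap = calloc(words ? words : 1, sizeof(unsigned long));
    return slot->roe_bitmap ? 0 : -1;
}

static bool gfn_in_slot(const struct roe_slot *slot, unsigned long gfn)
{
    return gfn >= slot->base_gfn && gfn - slot->base_gfn < slot->npages;
}

/*
 * Mark up to count pages starting at gfn as read-only and return how many
 * were marked. There is deliberately no function that clears a bit.
 */
unsigned long roe_protect(struct roe_slot *slot, unsigned long gfn,
                          unsigned long count)
{
    unsigned long i, marked = 0;

    for (i = 0; i < count && gfn_in_slot(slot, gfn + i); i++) {
        unsigned long off = gfn + i - slot->base_gfn;

        slot->roe_bitmap[off / BITS_PER_WORD] |= 1UL << (off % BITS_PER_WORD);
        marked++;
    }
    return marked;
}

/* True if the page at gfn lies in the slot and has been protected. */
bool roe_gfn_is_readonly(const struct roe_slot *slot, unsigned long gfn)
{
    unsigned long off;

    if (!gfn_in_slot(slot, gfn))
        return false;
    off = gfn - slot->base_gfn;
    return (slot->roe_bitmap[off / BITS_PER_WORD] >> (off % BITS_PER_WORD)) & 1UL;
}
```

The absence of an "unprotect" helper mirrors the irreversibility described above: once a bit is set, it stays set until the slot itself is freed.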

Signed-off-by: Ahmed Abd El Mawgood 
---
 include/kvm/roe.h |  16 
 include/linux/kvm_host.h  |   1 +
 include/uapi/linux/kvm_para.h |   4 +
 virt/kvm/kvm_main.c   |  19 +++--
 virt/kvm/roe.c| 136 ++
 virt/kvm/roe_generic.h|  19 +
 6 files changed, 190 insertions(+), 5 deletions(-)
 create mode 100644 include/kvm/roe.h
 create mode 100644 virt/kvm/roe.c
 create mode 100644 virt/kvm/roe_generic.h

diff --git a/include/kvm/roe.h b/include/kvm/roe.h
new file mode 100644
index 00..6a86866623
--- /dev/null
+++ b/include/kvm/roe.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __KVM_ROE_H__
+#define __KVM_ROE_H__
+/*
+ * KVM Read Only Enforcement
+ * Copyright (c) 2018 Ahmed Abd El Mawgood
+ *
+ * Author Ahmed Abd El Mawgood 
+ *
+ */
+void kvm_roe_arch_commit_protection(struct kvm *kvm,
+   struct kvm_memory_slot *slot);
+int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3);
+bool kvm_roe_arch_is_userspace(struct kvm_vcpu *vcpu);
+#endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c926698040..0baea5afcd 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -297,6 +297,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct 
kvm_vcpu *vcpu)
 struct kvm_memory_slot {
gfn_t base_gfn;
unsigned long npages;
+   unsigned long *roe_bitmap;
unsigned long *dirty_bitmap;
struct kvm_arch_memory_slot arch;
unsigned long userspace_addr;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 6c0ce49931..e6004e0750 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -28,7 +28,11 @@
 #define KVM_HC_MIPS_CONSOLE_OUTPUT 8
 #define KVM_HC_CLOCK_PAIRING   9
 #define KVM_HC_SEND_IPI10
+#define KVM_HC_ROE 11
 
+/* ROE Functionality parameters */
+#define ROE_VERSION0
+#define ROE_MPROTECT   1
 /*
  * hypercalls use architecture specific
  */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 039c1ef9a7..814ee0fd35 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -61,6 +61,7 @@
 #include "coalesced_mmio.h"
 #include "async_pf.h"
 #include "vfio.h"
+#include "roe_generic.h"
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -552,9 +553,10 @@ static void kvm_free_memslot(struct kvm *kvm, struct 
kvm_memory_slot *free,
  struct kvm_memory_slot *dont,
  enum kvm_mr_change change)
 {
-   if (change == KVM_MR_DELETE)
+   if (change == KVM_MR_DELETE) {
+   kvm_roe_free(free);
kvm_destroy_dirty_bitmap(free);
-
+   }
kvm_arch_free_memslot(kvm, free, dont);
 
free->npages = 0;
@@ -1020,6 +1022,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
if (kvm_create_dirty_bitmap(&new) < 0)
goto out_free;
}
+   if (kvm_roe_init(&new) < 0)
+   goto out_free;
 
slots = kvzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
if (!slots)
@@ -1273,13 +1277,18 @@ static bool memslot_is_readonly(struct kvm_memory_slot 
*slot)
return slot->flags & KVM_MEM_READONLY;
 }
 
+static bool gfn_is_readonly(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   return gfn_is_full_roe(slot, gfn) || memslot_is_readonly(slot);
+}
+
 static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
   gfn_t *nr_pages, bool write)
 {
if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
return KVM_HVA_ERR_BAD;
 
-   if (memslot_is_readonly(slot) && write)
+   if (gfn_is_readonly(slot, gfn) && write)
return KVM_HVA_ERR_RO_BAD;
 
if (nr_pages)
@@ -1327,7 +1336,7 @@ unsigned long gfn_to_hva_mems

[PATCH 04/10] KVM: Document Memory ROE

2018-12-07 Thread Ahmed Abd El Mawgood
The ROE version documented here is implemented in the next 2 patches.

Signed-off-by: Ahmed Abd El Mawgood 
---
 Documentation/virtual/kvm/hypercalls.txt | 40 
 1 file changed, 40 insertions(+)

diff --git a/Documentation/virtual/kvm/hypercalls.txt 
b/Documentation/virtual/kvm/hypercalls.txt
index da24c138c8..a31f316ce6 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -141,3 +141,43 @@ a0 corresponds to the APIC ID in the third argument (a2), 
bit 1
 corresponds to the APIC ID a2+1, and so on.
 
 Returns the number of CPUs to which the IPIs were delivered successfully.
+
+7. KVM_HC_ROE
+
+Architecture: x86
+Status: active
+Purpose: Hypercall used to apply Read-Only Enforcement to guest memory and
+registers
+Usage 1:
+ a0: ROE_VERSION
+
+Returns non-signed number that represents the current version of ROE
+implementation current version.
+
+Usage 2:
+
+ a0: ROE_MPROTECT  (requires version >= 1)
+ a1: Start address aligned to page boundary.
+ a2: Number of pages to be protected.
+
+This configuration lets a guest kernel have part of its read/write memory
+converted into read-only.  This action is irreversible.
+Upon successful run, the number of pages protected is returned.
+
+Usage 3:
+ a0: ROE_MPROTECT_CHUNK(requires version >= 2)
+ a1: Start address aligned to page boundary.
+ a2: Number of bytes to be protected.
+This configuration lets a guest kernel have part of its read/write memory
+converted into read-only with bytes granularity. ROE_MPROTECT_CHUNK is
+relatively slow compared to ROE_MPROTECT. This action is irreversible.
+Upon successful run, the number of bytes protected is returned.
+
+Error codes:
+   -KVM_ENOSYS: system call being triggered from ring 3 or it is not
+   implemented.
+   -EINVAL: error based on given parameters.
+
+Notes: KVM_HC_ROE can not be triggered from guest Ring 3 (user mode). The
+reason is that user mode malicious software can make use of it to enforce read
+only protection on an arbitrary memory page thus crashing the kernel.
-- 
2.19.2



[PATCH V7 0/10] KVM: X86: Introducing ROE Protection Kernel Hardening

2018-12-07 Thread Ahmed Abd El Mawgood
oo host with Ubuntu guest and Qemu from git after applying
the following changes to Qemu

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 4880a05399..57d0973aca 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2035,6 +2035,9 @@ int kvm_cpu_exec(CPUState *cpu)
  run->mmio.is_write);
 ret = 0;
 break;
+   case KVM_EXIT_ROE:
+   ret = 0;
+   break;
 case KVM_EXIT_IRQ_WINDOW_OPEN:
 DPRINTF("irq_window_open\n");
 ret = EXCP_INTERRUPT;
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index f11a7eb49c..67aded8f00 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -235,7 +235,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI25
 #define KVM_EXIT_IOAPIC_EOI   26
 #define KVM_EXIT_HYPERV   27
-
+#define KVM_EXIT_ROE  28
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
 #define KVM_INTERNAL_ERROR_EMULATION   1



-- Change log V6 -> V7 --

- Completely remove CONFIG_KVM_ROE, ROE is always enabled, since it is opt in
  anyway.
- Bug fixes regarding how each element in the protection bitmap maps to the
  equivalent SPTE.
- General code cleanup.


-- Known Issues --

- THP is not supported yet. In general it is not supported when the guest frame
  size is not the same as the equivalent EPT frame size.

The previous version (V6) of the patch set can be found at [1]

-- links --

[1] https://lkml.org/lkml/2018/11/4/417

-- List of patches --

[PATCH V7 01/10] KVM: State whether memory should be freed in
[PATCH V7 02/10] KVM: X86: Add arbitrary data pointer in kvm memslot
[PATCH V7 03/10] KVM: X86: Add helper function to convert SPTE to GFN
[PATCH V7 04/10] KVM: Document Memory ROE
[PATCH V7 05/10] KVM: Create architecture independent ROE skeleton
[PATCH V7 06/10] KVM: X86: Enable ROE for x86
[PATCH V7 07/10] KVM: Add support for byte granular memory ROE
[PATCH V7 08/10] KVM: X86: Port ROE_MPROTECT_CHUNK to x86
[PATCH V7 09/10] KVM: Add new exit reason For ROE violations
[PATCH V7 10/10] KVM: Log ROE violations in system log


-- Diffstat --

 Documentation/virtual/kvm/hypercalls.txt |  40 
 arch/x86/include/asm/kvm_host.h  |   2 +-
 arch/x86/kvm/Makefile|   4 +-
 arch/x86/kvm/mmu.c   | 121 +--
 arch/x86/kvm/mmu.h   |  31 ++-
 arch/x86/kvm/roe.c   | 104 ++
 arch/x86/kvm/roe_arch.h  |  28 +++
 arch/x86/kvm/x86.c   |  21 +-
 include/kvm/roe.h|  28 +++
 include/linux/kvm_host.h |  25 +++
 include/uapi/linux/kvm.h |   2 +-
 include/uapi/linux/kvm_para.h    |   5 +
 virt/kvm/kvm_main.c  |  56 +++--
 virt/kvm/roe.c   | 342 +++
 virt/kvm/roe_generic.h   |  18 ++
 15 files changed, 732 insertions(+), 95 deletions(-)

Signed-off-by: Ahmed Abd El Mawgood 


[PATCH 06/10] KVM: X86: Enable ROE for x86

2018-12-07 Thread Ahmed Abd El Mawgood
This patch implements kvm_roe_arch_commit_protection and
kvm_roe_arch_is_userspace for x86, and invokes kvm_roe via the
appropriate vmcall.

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/include/asm/kvm_host.h |   2 +-
 arch/x86/kvm/Makefile   |   4 +-
 arch/x86/kvm/mmu.c  |  71 +-
 arch/x86/kvm/mmu.h  |  30 +-
 arch/x86/kvm/roe.c  | 101 
 arch/x86/kvm/roe_arch.h |  28 +
 arch/x86/kvm/x86.c  |  11 ++--
 7 files changed, 183 insertions(+), 64 deletions(-)
 create mode 100644 arch/x86/kvm/roe.c
 create mode 100644 arch/x86/kvm/roe_arch.h

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fbda5a917c..e56903e6ff 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1230,7 +1230,7 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 
accessed_mask,
u64 acc_track_mask, u64 me_mask);
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
-void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
+void kvm_mmu_slot_apply_write_access(struct kvm *kvm,
  struct kvm_memory_slot *memslot);
 void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
   const struct kvm_memory_slot *memslot);
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index dc4f2fdf5e..a8c915a326 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -9,7 +9,9 @@ CFLAGS_vmx.o := -I.
 KVM := ../../../virt/kvm
 
 kvm-y  += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
-   $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o
+  $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o \
+  $(KVM)/roe.o roe.o
+
 kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
 
 kvm-y  += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index a300e4acb8..fde565c8a1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -23,7 +23,7 @@
 #include "x86.h"
 #include "kvm_cache_regs.h"
 #include "cpuid.h"
-
+#include "roe_arch.h"
 #include 
 #include 
 #include 
@@ -1314,8 +1314,8 @@ static void pte_list_remove(struct kvm_rmap_head 
*rmap_head, u64 *sptep)
__pte_list_remove(sptep, rmap_head);
 }
 
-static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
-  struct kvm_memory_slot *slot)
+struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
+   struct kvm_memory_slot *slot)
 {
unsigned long idx;
 
@@ -1365,16 +1365,6 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
__pte_list_remove(spte, rmap_head);
 }
 
-/*
- * Used by the following functions to iterate through the sptes linked by a
- * rmap.  All fields are private and not assumed to be used outside.
- */
-struct rmap_iterator {
-   /* private fields */
-   struct pte_list_desc *desc; /* holds the sptep if not NULL */
-   int pos;/* index of the sptep */
-};
-
 /*
  * Iteration must be started by this function.  This should also be used after
  * removing/dropping sptes from the rmap link because in such cases the
@@ -1382,8 +1372,7 @@ struct rmap_iterator {
  *
  * Returns sptep if found, NULL otherwise.
  */
-static u64 *rmap_get_first(struct kvm_rmap_head *rmap_head,
-  struct rmap_iterator *iter)
+u64 *rmap_get_first(struct kvm_rmap_head *rmap_head, struct rmap_iterator 
*iter)
 {
u64 *sptep;
 
@@ -1409,7 +1398,7 @@ static u64 *rmap_get_first(struct kvm_rmap_head 
*rmap_head,
  *
  * Returns sptep if found, NULL otherwise.
  */
-static u64 *rmap_get_next(struct rmap_iterator *iter)
+u64 *rmap_get_next(struct rmap_iterator *iter)
 {
u64 *sptep;
 
@@ -1480,7 +1469,7 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 
*sptep)
  *
  * Return true if tlb need be flushed.
  */
-static bool spte_write_protect(u64 *sptep, bool pt_protect)
+bool spte_write_protect(u64 *sptep, bool pt_protect)
 {
u64 spte = *sptep;
 
@@ -1498,8 +1487,7 @@ static bool spte_write_protect(u64 *sptep, bool 
pt_protect)
 }
 
 static bool __rmap_write_protect(struct kvm *kvm,
-struct kvm_rmap_head *rmap_head,
-bool pt_protect, void *data)
+   struct kvm_rmap_head *rmap_head, bool pt_protect)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1598,7 +1586,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm 
*kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_write_protect(kvm, rmap_head, false, NULL);
+  

[PATCH 03/10] KVM: X86: Add helper function to convert SPTE to GFN

2018-12-07 Thread Ahmed Abd El Mawgood
Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/mmu.c | 7 +++
 arch/x86/kvm/mmu.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b67d743c33..a300e4acb8 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1024,6 +1024,13 @@ static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page 
*sp, int index)
 
return sp->gfn + (index << ((sp->role.level - 1) * PT64_LEVEL_BITS));
 }
+gfn_t spte_to_gfn(u64 *spte)
+{
+   struct kvm_mmu_page *sp;
+
+   sp = page_header(__pa(spte));
+   return kvm_mmu_page_get_gfn(sp, spte - sp->spt);
+}
 
 static void kvm_mmu_page_set_gfn(struct kvm_mmu_page *sp, int index, gfn_t gfn)
 {
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index c7b333147c..49d7f2f002 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -211,4 +211,5 @@ void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, 
gfn_t gfn);
 bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
struct kvm_memory_slot *slot, u64 gfn);
 int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
+gfn_t spte_to_gfn(u64 *sptep);
 #endif
-- 
2.19.2



[PATCH 08/10] KVM: X86: Port ROE_MPROTECT_CHUNK to x86

2018-12-07 Thread Ahmed Abd El Mawgood
Apply d->memslot->partial_roe_bitmap to shadow page table entries
too.

Signed-off-by: Ahmed Abd El Mawgood 
---
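Note (not part of the patch): the change above consults both the full and the
partial bitmap for each SPTE, so a page counts as ROE-marked if its bit is set
in either one. A minimal userspace sketch of that lookup follows. bit_is_set()
and bit_set() are hand-rolled stand-ins for the kernel's bit helpers, and all
names are illustrative assumptions rather than kernel API.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Userspace stand-in for the kernel's test_bit(). */
static bool bit_is_set(size_t idx, const unsigned long *bitmap)
{
	assert(bitmap != NULL);
	return (bitmap[idx / BITS_PER_WORD] >> (idx % BITS_PER_WORD)) & 1UL;
}

/* Userspace stand-in for setting one bit in a bitmap. */
static void bit_set(size_t idx, unsigned long *bitmap)
{
	assert(bitmap != NULL);
	bitmap[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
}

/* A page is treated as ROE-marked if either bitmap has its bit set. */
static bool page_is_roe_marked(size_t idx, const unsigned long *full_bmp,
			       const unsigned long *part_bmp)
{
	return bit_is_set(idx, full_bmp) || bit_is_set(idx, part_bmp);
}
```

The index passed in corresponds to the patch's "gfn - base_gfn" offset within
the memory slot.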
 arch/x86/kvm/roe.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/roe.c b/arch/x86/kvm/roe.c
index f787106be8..700f69823b 100644
--- a/arch/x86/kvm/roe.c
+++ b/arch/x86/kvm/roe.c
@@ -25,11 +25,14 @@ static bool __rmap_write_protect_roe(struct kvm *kvm,
struct rmap_iterator iter;
bool prot;
bool flush = false;
+   void *full_bmp =  memslot->roe_bitmap;
+   void *part_bmp = memslot->partial_roe_bitmap;
 
for_each_rmap_spte(rmap_head, &iter, sptep) {
int idx = spte_to_gfn(sptep) - memslot->base_gfn;
 
-   prot = !test_bit(idx, memslot->roe_bitmap) && pt_protect;
+   prot = !(test_bit(idx, full_bmp) || test_bit(idx, part_bmp));
+   prot = prot && pt_protect;
flush |= spte_write_protect(sptep, prot);
}
return flush;
-- 
2.19.2



[PATCH 07/10] KVM: Add support for byte granular memory ROE

2018-12-07 Thread Ahmed Abd El Mawgood
This patch documents and implements ROE_MPROTECT_CHUNK, a part of the ROE
hypercall designed to protect regions of a memory page with byte
granularity. This feature provides a key primitive for protecting against
attacks that involve page remapping.

Signed-off-by: Ahmed Abd El Mawgood 
---
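Note (not part of the patch): the byte-granular check described above comes
down to an overlap test between the written range and each protected chunk. A
minimal userspace sketch of that test, assuming inclusive byte ranges and
non-zero lengths; struct chunk and range_overlaps() are illustrative names,
not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the patch's struct protected_chunk. */
struct chunk {
	uint64_t gpa;   /* first protected byte */
	uint64_t size;  /* number of protected bytes, must be > 0 */
};

/*
 * Return true if the write [gpa, gpa + len - 1] touches any byte of the
 * chunk [c->gpa, c->gpa + c->size - 1]. Both ranges are inclusive.
 */
static bool range_overlaps(const struct chunk *c, uint64_t gpa, uint64_t len)
{
	assert(c->size > 0 && len > 0);
	return gpa <= c->gpa + c->size - 1 && gpa + len - 1 >= c->gpa;
}
```

Two ranges overlap exactly when each one starts no later than the other ends,
which is why two comparisons are enough.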
 include/linux/kvm_host.h  |  24 
 include/uapi/linux/kvm_para.h |   1 +
 virt/kvm/kvm_main.c   |  24 +++-
 virt/kvm/roe.c| 212 --
 virt/kvm/roe_generic.h|   6 +
 5 files changed, 253 insertions(+), 14 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 0baea5afcd..159bef3450 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -294,10 +294,34 @@ static inline int kvm_vcpu_exiting_guest_mode(struct 
kvm_vcpu *vcpu)
  */
 #define KVM_MEM_MAX_NR_PAGES ((1UL << 31) - 1)
 
+/*
+ * This structure is used to hold memory areas that are to be protected in a
+ * memory frame with mixed page permissions.
+ **/
+struct protected_chunk {
+   gpa_t gpa;
+   u64 size;
+   struct list_head list;
+};
+
+static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk,
+   gpa_t gpa, int len) {
+   /*
+* https://stackoverflow.com/questions/325933/
+* determine-whether-two-date-ranges-overlap
+* Assuming that it works, that link ^ provides a solution that is
+* better than anything I would ever come up with.
+*/
+   return (gpa <= chunk->gpa + chunk->size - 1) &&
+   (gpa + len - 1 >= chunk->gpa);
+}
+
 struct kvm_memory_slot {
gfn_t base_gfn;
unsigned long npages;
unsigned long *roe_bitmap;
+   unsigned long *partial_roe_bitmap;
+   struct list_head *prot_list;
unsigned long *dirty_bitmap;
struct kvm_arch_memory_slot arch;
unsigned long userspace_addr;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index e6004e0750..4a84f974bc 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -33,6 +33,7 @@
 /* ROE Functionality parameters */
 #define ROE_VERSION0
 #define ROE_MPROTECT   1
+#define ROE_MPROTECT_CHUNK 2
 /*
  * hypercalls use architecture specific
  */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 814ee0fd35..0d129b05d5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1279,18 +1279,19 @@ static bool memslot_is_readonly(struct kvm_memory_slot 
*slot)
 
 static bool gfn_is_readonly(struct kvm_memory_slot *slot, gfn_t gfn)
 {
-   return gfn_is_full_roe(slot, gfn) || memslot_is_readonly(slot);
+   return gfn_is_full_roe(slot, gfn) ||
+  gfn_is_partial_roe(slot, gfn) ||
+  memslot_is_readonly(slot);
 }
 
+
 static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
   gfn_t *nr_pages, bool write)
 {
if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
return KVM_HVA_ERR_BAD;
-
if (gfn_is_readonly(slot, gfn) && write)
return KVM_HVA_ERR_RO_BAD;
-
if (nr_pages)
*nr_pages = slot->npages - (gfn - slot->base_gfn);
 
@@ -1852,14 +1853,29 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, 
gpa_t gpa,
return __kvm_read_guest_atomic(slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
+static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
+   int len)
+{
+   u64 addr;
 
+   if (!slot)
+   return KVM_HVA_ERR_RO_BAD;
+   if (kvm_roe_check_range(slot, gfn, offset, len))
+   return KVM_HVA_ERR_RO_BAD;
+   if (memslot_is_readonly(slot))
+   return KVM_HVA_ERR_RO_BAD;
+   if (gfn_is_full_roe(slot, gfn))
+   return KVM_HVA_ERR_RO_BAD;
+   addr = __gfn_to_hva_many(slot, gfn, NULL, false);
+   return addr;
+}
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
  const void *data, int offset, int len)
 {
int r;
unsigned long addr;
 
-   addr = gfn_to_hva_memslot(memslot, gfn);
+   addr = roe_gfn_to_hva(memslot, gfn, offset, len);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = __copy_to_user((void __user *)addr + offset, data, len);
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index 3f6eb6ede2..dfb1de314c 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -11,34 +11,89 @@
 #include 
 #include 
 #include 
+#include "roe_generic.h"
 
 int kvm_roe_init(struct kvm_memory_slot *slot)
 {
slot->roe_bitmap = kvzalloc(BITS_TO_LONGS(slot->npages) *
sizeof(unsigned long), GFP_KERNEL);
if (!slot->roe_bitmap)
-   return -ENOMEM;
+   

[PATCH 10/10] KVM: Log ROE violations in system log

2018-12-07 Thread Ahmed Abd El Mawgood
Signed-off-by: Ahmed Abd El Mawgood 
---
 virt/kvm/kvm_main.c|  5 +
 virt/kvm/roe.c | 14 ++
 virt/kvm/roe_generic.h |  2 +-
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c3a21d3bc8..661933053f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1870,6 +1870,7 @@ static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, 
gfn_t gfn, int offset,
addr = __gfn_to_hva_many(slot, gfn, NULL, false);
return addr;
 }
+
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
  const void *data, int offset, int len)
 {
@@ -1877,6 +1878,10 @@ static int __kvm_write_guest_page(struct kvm_memory_slot 
*memslot, gfn_t gfn,
unsigned long addr;
 
addr = roe_gfn_to_hva(memslot, gfn, offset, len);
+   if (gfn_is_full_roe(memslot, gfn) ||
+   kvm_roe_check_range(memslot, gfn, offset, len))
+   kvm_warning_roe_violation((gfn << PAGE_SHIFT) + offset, data,
+   len);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = __copy_to_user((void __user *)addr + offset, data, len);
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index 6555838f0c..36b85fb303 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -76,6 +76,20 @@ void kvm_roe_free(struct kvm_memory_slot *slot)
kvfree(slot->prot_list);
 }
 
+void kvm_warning_roe_violation(u64 addr, const void *data, int len)
+{
+   int i;
+   const char *d = data;
+   char *buf = kvmalloc(len * 3 + 1, GFP_KERNEL);
+
+   for (i = 0; i < len; i++)
+   sprintf(buf+3*i, " %02x", d[i]);
+   pr_warn("ROE violation:\n");
+   pr_warn("\tAttempt to write %d bytes at address 0x%08llx\n", len, addr);
+   pr_warn("\tData: %s\n", buf);
+   kvfree(buf);
+}
+
 static void kvm_roe_protect_slot(struct kvm *kvm, struct kvm_memory_slot *slot,
gfn_t gfn, u64 npages, bool partial)
 {
diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h
index f1ce4a8aec..8c191362cd 100644
--- a/virt/kvm/roe_generic.h
+++ b/virt/kvm/roe_generic.h
@@ -14,5 +14,5 @@ void kvm_roe_free(struct kvm_memory_slot *slot);
 int kvm_roe_init(struct kvm_memory_slot *slot);
 bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
int len);
-
+void kvm_warning_roe_violation(u64 addr, const void *data, int len);
 #endif
-- 
2.19.2



[PATCH 09/10] KVM: Add new exit reason For ROE violations

2018-12-07 Thread Ahmed Abd El Mawgood
The problem is that QEMU cannot detect ROE violations on its own. One
option would be to create a host API that tells whether a given page is ROE
protected; the other is to add an exit reason for ROE violations, which is
what this patch does.

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/x86.c   | 10 +-
 include/kvm/roe.h| 12 
 include/uapi/linux/kvm.h |  2 +-
 virt/kvm/kvm_main.c  |  1 +
 virt/kvm/roe.c   |  2 +-
 virt/kvm/roe_generic.h   |  9 +
 6 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 28475c83f9..ddd15bb1a7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5334,6 +5334,7 @@ static int emulator_read_write(struct x86_emulate_ctxt 
*ctxt,
const struct read_write_emulator_ops *ops)
 {
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+   struct kvm_memory_slot *slot;
gpa_t gpa;
int rc;
 
@@ -5375,7 +5376,14 @@ static int emulator_read_write(struct x86_emulate_ctxt 
*ctxt,
 
vcpu->run->mmio.len = min(8u, vcpu->mmio_fragments[0].len);
vcpu->run->mmio.is_write = vcpu->mmio_is_write = ops->write;
-   vcpu->run->exit_reason = KVM_EXIT_MMIO;
+   slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa >> PAGE_SHIFT);
+   if (slot && ops->write && (kvm_roe_check_range(slot, gpa>>PAGE_SHIFT,
+   gpa - (gpa & PAGE_MASK), bytes) ||
+   gfn_is_full_roe(slot, gpa>>PAGE_SHIFT)))
+   vcpu->run->exit_reason = KVM_EXIT_ROE;
+   else
+   vcpu->run->exit_reason = KVM_EXIT_MMIO;
+
vcpu->run->mmio.phys_addr = gpa;
 
return ops->read_write_exit_mmio(vcpu, gpa, val, bytes);
diff --git a/include/kvm/roe.h b/include/kvm/roe.h
index 6a86866623..3121a67753 100644
--- a/include/kvm/roe.h
+++ b/include/kvm/roe.h
@@ -13,4 +13,16 @@ void kvm_roe_arch_commit_protection(struct kvm *kvm,
struct kvm_memory_slot *slot);
 int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3);
 bool kvm_roe_arch_is_userspace(struct kvm_vcpu *vcpu);
+bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
+   int len);
+static inline bool gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   return test_bit(gfn - slot->base_gfn, slot->roe_bitmap);
+
+}
+static inline bool gfn_is_partial_roe(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   return test_bit(gfn - slot->base_gfn, slot->partial_roe_bitmap);
+}
+
 #endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2b7a652c9f..185767e512 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -235,7 +235,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI25
 #define KVM_EXIT_IOAPIC_EOI   26
 #define KVM_EXIT_HYPERV   27
-
+#define KVM_EXIT_ROE 28
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
 #define KVM_INTERNAL_ERROR_EMULATION   1
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 0d129b05d5..c3a21d3bc8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -62,6 +62,7 @@
 #include "async_pf.h"
 #include "vfio.h"
 #include "roe_generic.h"
+#include 
 
 #define CREATE_TRACE_POINTS
 #include 
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index dfb1de314c..6555838f0c 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -60,7 +60,7 @@ bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t 
gfn, int offset,
return false;
return kvm_roe_protected_range(slot, gpa, len);
 }
-
+EXPORT_SYMBOL_GPL(kvm_roe_check_range);
 
 void kvm_roe_free(struct kvm_memory_slot *slot)
 {
diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h
index ad121372f2..f1ce4a8aec 100644
--- a/virt/kvm/roe_generic.h
+++ b/virt/kvm/roe_generic.h
@@ -14,12 +14,5 @@ void kvm_roe_free(struct kvm_memory_slot *slot);
 int kvm_roe_init(struct kvm_memory_slot *slot);
 bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
int len);
-static inline bool gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn)
-{
-   return test_bit(gfn - slot->base_gfn, slot->roe_bitmap);
-}
-static inline bool gfn_is_partial_roe(struct kvm_memory_slot *slot, gfn_t gfn)
-{
-   return test_bit(gfn - slot->base_gfn, slot->partial_roe_bitmap);
-}
+
 #endif
-- 
2.19.2



RESEND [PATCH 10/10] KVM: Log ROE violations in system log

2018-12-07 Thread Ahmed Abd El Mawgood
I am very sorry: I had some modifications that I forgot to commit
before sending. Please use this version of patch 10/10 instead of the
previous one.

Signed-off-by: Ahmed Abd El Mawgood 
---
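Note (not part of the patch): the logging helper renders the written bytes as
a hex string, three characters per byte. A userspace sketch of that formatting
step, under two assumptions that are mine rather than the patch's: bytes are
read as unsigned char, and the allocation is checked before use. format_hex()
is an illustrative name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format len bytes as " xx xx ..." (three characters per byte).
 * Returns a malloc'd string the caller must free, or NULL on failure.
 */
static char *format_hex(const void *data, size_t len)
{
	const unsigned char *d = data;
	char *buf;
	size_t i;

	assert(data != NULL || len == 0);
	buf = malloc(len * 3 + 1);
	if (!buf)
		return NULL;
	buf[0] = '\0';
	/* Each step writes " xx" plus a terminating NUL: 4 bytes. */
	for (i = 0; i < len; i++)
		snprintf(buf + 3 * i, 4, " %02x", (unsigned int)d[i]);
	return buf;
}
```

Where plain char is signed, a byte such as 0xff read through a char pointer is
sign-extended and "%02x" prints it as "ffffffff", which no longer fits the
three-character slot. Reading through unsigned char avoids that.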
 virt/kvm/kvm_main.c|  3 ++-
 virt/kvm/roe.c | 26 ++
 virt/kvm/roe_generic.h |  3 ++-
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c3a21d3bc8..761cb7561a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1870,13 +1870,14 @@ static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, 
gfn_t gfn, int offset,
addr = __gfn_to_hva_many(slot, gfn, NULL, false);
return addr;
 }
+
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
  const void *data, int offset, int len)
 {
int r;
unsigned long addr;
-
addr = roe_gfn_to_hva(memslot, gfn, offset, len);
+   kvm_roe_check_and_log(memslot, gfn, data, offset, len);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = __copy_to_user((void __user *)addr + offset, data, len);
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index 6555838f0c..01362f0fca 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -76,6 +76,32 @@ void kvm_roe_free(struct kvm_memory_slot *slot)
kvfree(slot->prot_list);
 }
 
+static void kvm_warning_roe_violation(u64 addr, const void *data, int len)
+{
+   int i;
+   const char *d = data;
+   char *buf = kvmalloc(len * 3 + 1, GFP_KERNEL);
+
+   for (i = 0; i < len; i++)
+   sprintf(buf+3*i, " %02x", d[i]);
+   pr_warn("ROE violation:\n");
+   pr_warn("\tAttempt to write %d bytes at address 0x%08llx\n", len, addr);
+   pr_warn("\tData: %s\n", buf);
+   kvfree(buf);
+}
+
+void kvm_roe_check_and_log(struct kvm_memory_slot *memslot, gfn_t gfn,
+   const void *data, int offset, int len)
+{
+   if (!memslot)
+   return;
+   if (!gfn_is_full_roe(memslot, gfn))
+   return;
+   if (!kvm_roe_check_range(memslot, gfn, offset, len))
+   return;
+   kvm_warning_roe_violation((gfn << PAGE_SHIFT) + offset, data, len);
+}
+
 static void kvm_roe_protect_slot(struct kvm *kvm, struct kvm_memory_slot *slot,
gfn_t gfn, u64 npages, bool partial)
 {
diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h
index f1ce4a8aec..6c5f0cf381 100644
--- a/virt/kvm/roe_generic.h
+++ b/virt/kvm/roe_generic.h
@@ -14,5 +14,6 @@ void kvm_roe_free(struct kvm_memory_slot *slot);
 int kvm_roe_init(struct kvm_memory_slot *slot);
 bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
int len);
-
+void kvm_roe_check_and_log(struct kvm_memory_slot *memslot, gfn_t gfn,
+   const void *data, int offset, int len);
 #endif
-- 
2.19.2



[PATCH V8 0/11] KVM: X86: Introducing ROE Protection Kernel Hardening

2019-01-06 Thread Ahmed Abd El Mawgood
"Actually this is more of an ABI demonstration\n");
pr_info("than actual use case\n");
}
module_init(hello);
module_exit(bye);

```

I tried this on a Gentoo host with an Ubuntu guest and QEMU built from git,
after applying the following changes to QEMU:

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 4880a05399..57d0973aca 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2035,6 +2035,9 @@ int kvm_cpu_exec(CPUState *cpu)
  run->mmio.is_write);
 ret = 0;
 break;
+   case KVM_EXIT_ROE:
+   ret = 0;
+   break;
 case KVM_EXIT_IRQ_WINDOW_OPEN:
 DPRINTF("irq_window_open\n");
 ret = EXCP_INTERRUPT;
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index f11a7eb49c..67aded8f00 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -235,7 +235,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI25
 #define KVM_EXIT_IOAPIC_EOI   26
 #define KVM_EXIT_HYPERV   27
-
+#define KVM_EXIT_ROE  28
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
 #define KVM_INTERNAL_ERROR_EMULATION   1



-- Change log V7 -> V8 --

- Bug fix in patch 10 (it did not work).
- Replace the linked list used to store protected chunks with a red-black
  tree. This gave a large performance improvement: query time when writing
  with ~2000 protected chunks was almost constant.
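
Note: to illustrate why an ordered structure helps with chunk lookups, the
sketch below finds the chunk covering an address by binary search over a sorted
array of non-overlapping chunks. This is a stand-in chosen for brevity, not the
red-black tree used in patch 11: it has the same logarithmic lookup cost but
not the tree's cheap insertion. All names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct chunk {
	uint64_t gpa;   /* first protected byte */
	uint64_t size;  /* number of protected bytes, must be > 0 */
};

/*
 * chunks[] must be sorted by gpa and non-overlapping.
 * Returns the chunk containing addr, or NULL if no chunk does.
 */
static const struct chunk *find_chunk(const struct chunk *chunks, size_t n,
				      uint64_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct chunk *c = &chunks[mid];

		assert(c->size > 0);
		if (addr < c->gpa)
			hi = mid;               /* addr lies before this chunk */
		else if (addr - c->gpa >= c->size)
			lo = mid + 1;           /* addr lies after this chunk */
		else
			return c;               /* addr falls inside this chunk */
	}
	return NULL;
}
```

A linked list needs a scan over every chunk per write, while an ordered
structure needs roughly log2(n) comparisons, about 11 for 2000 chunks.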


-- Known Issues --

- THP is not supported yet. More generally, ROE is not supported when the guest
  frame size differs from the equivalent EPT frame size.

The previous version (V7) of the patch set can be found at [1]

-- links --

[1] https://lkml.org/lkml/2018/12/7/345
[2] https://lkml.org/lkml/2018/12/21/340

-- List of patches --

[PATCH V8 01/11] KVM: State whether memory should be freed in
[PATCH V8 02/11] KVM: X86: Add arbitrary data pointer in kvm memslot
[PATCH V8 03/11] KVM: X86: Add helper function to convert SPTE to GFN
[PATCH V8 04/11] KVM: Document Memory ROE
[PATCH V8 05/11] KVM: Create architecture independent ROE skeleton
[PATCH V8 06/11] KVM: X86: Enable ROE for x86
[PATCH V8 07/11] KVM: Add support for byte granular memory ROE
[PATCH V8 08/11] KVM: X86: Port ROE_MPROTECT_CHUNK to x86
[PATCH V8 09/11] KVM: Add new exit reason For ROE violations
[PATCH V8 10/11] KVM: Log ROE violations in system log
[PATCH V8 11/11] KVM: ROE: Store protected chunks in red black tree

-- Diffstat --

Documentation/virtual/kvm/hypercalls.txt |  40 +++
arch/x86/include/asm/kvm_host.h  |   2 +-
arch/x86/kvm/Makefile|   4 +-
arch/x86/kvm/mmu.c   | 121 -
arch/x86/kvm/mmu.h   |  31 ++-
arch/x86/kvm/roe.c   | 104 
arch/x86/kvm/roe_arch.h  |  28 ++
arch/x86/kvm/x86.c   |  21 +-
include/kvm/roe.h|  28 ++
include/linux/kvm_host.h |  57 
include/uapi/linux/kvm.h |   2 +-
include/uapi/linux/kvm_para.h|   5 +
virt/kvm/kvm_main.c  |  54 +++-
virt/kvm/roe.c   | 445 +++
virt/kvm/roe_generic.h   |  22 ++
15 files changed, 868 insertions(+), 96 deletions(-)


Signed-off-by: Ahmed Abd El Mawgood 


[PATCH V8 02/11] KVM: X86: Add arbitrary data pointer in kvm memslot iterator functions

2019-01-06 Thread Ahmed Abd El Mawgood
This helps share data with the slot_level_handler callback. In my case I
need to share a counter of the pages traversed, for use with a bitmap.
Being able to pass an arbitrary memory pointer into the
slot_level_handler callback makes this easy.

Signed-off-by: Ahmed Abd El Mawgood 
---
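Note (not part of the patch): the pattern described above is the usual
"callback plus opaque pointer" idiom. A minimal userspace sketch, in which the
handler counts the items it visits through the void pointer. The names
level_handler, count_items() and for_each_item() are illustrative and do not
exist in KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Handler signature with an opaque pointer, mirroring the patch's idea. */
typedef bool (*level_handler)(int item, void *data);

/* Example handler: counts visited items through the opaque pointer. */
static bool count_items(int item, void *data)
{
	size_t *counter = data;

	(void)item;
	if (counter)            /* callers that need no state pass NULL */
		(*counter)++;
	return false;           /* nothing to flush in this toy example */
}

/* Walk items[] and OR together the handlers' results. */
static bool for_each_item(const int *items, size_t n, level_handler fn,
			  void *data)
{
	bool flush = false;
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < n; i++)
		flush |= fn(items[i], data);
	return flush;
}
```

Existing callers that carry no state keep working by passing NULL, which is
what the diff below does at the call sites it touches.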
 arch/x86/kvm/mmu.c | 65 ++
 1 file changed, 37 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ce770b4462..098df7d135 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1525,7 +1525,7 @@ static bool spte_write_protect(u64 *sptep, bool 
pt_protect)
 
 static bool __rmap_write_protect(struct kvm *kvm,
 struct kvm_rmap_head *rmap_head,
-bool pt_protect)
+bool pt_protect, void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1564,7 +1564,8 @@ static bool wrprot_ad_disabled_spte(u64 *sptep)
  * - W bit on ad-disabled SPTEs.
  * Returns true iff any D or W bits were cleared.
  */
-static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head 
*rmap_head)
+static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head 
*rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1590,7 +1591,8 @@ static bool spte_set_dirty(u64 *sptep)
return mmu_spte_update(sptep, spte);
 }
 
-static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
+static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1622,7 +1624,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm 
*kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_write_protect(kvm, rmap_head, false);
+   __rmap_write_protect(kvm, rmap_head, false, NULL);
 
/* clear the first set bit */
mask &= mask - 1;
@@ -1648,7 +1650,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_clear_dirty(kvm, rmap_head);
+   __rmap_clear_dirty(kvm, rmap_head, NULL);
 
/* clear the first set bit */
mask &= mask - 1;
@@ -1701,7 +1703,8 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 
for (i = PT_PAGE_TABLE_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
rmap_head = __gfn_to_rmap(gfn, i, slot);
-   write_protected |= __rmap_write_protect(kvm, rmap_head, true);
+   write_protected |= __rmap_write_protect(kvm, rmap_head, true,
+   NULL);
}
 
return write_protected;
@@ -1715,7 +1718,8 @@ static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 
gfn)
return kvm_mmu_slot_gfn_write_protect(vcpu->kvm, slot, gfn);
 }
 
-static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
+static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1735,7 +1739,7 @@ static int kvm_unmap_rmapp(struct kvm *kvm, struct 
kvm_rmap_head *rmap_head,
   struct kvm_memory_slot *slot, gfn_t gfn, int level,
   unsigned long data)
 {
-   return kvm_zap_rmapp(kvm, rmap_head);
+   return kvm_zap_rmapp(kvm, rmap_head, NULL);
 }
 
 static int kvm_set_pte_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
@@ -5552,13 +5556,15 @@ void kvm_mmu_uninit_vm(struct kvm *kvm)
 }
 
 /* The return value indicates if tlb flush on all vcpus is needed. */
-typedef bool (*slot_level_handler) (struct kvm *kvm, struct kvm_rmap_head 
*rmap_head);
+typedef bool (*slot_level_handler) (struct kvm *kvm,
+   struct kvm_rmap_head *rmap_head, void *data);
 
 /* The caller should hold mmu-lock before calling this function. */
 static __always_inline bool
 slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot,
slot_level_handler fn, int start_level, int end_level,
-   gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb)
+   gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb,
+   void *data)
 {
struct slot_rmap_walk_iterator iterator;
bool flush = false;
@@ -5566,7 +5572,7 @@ slot_handle_level_range(struct kvm *kvm, struct 
kvm_memory_slot *memslot,
for_each_slot_rmap_range(memslot, start_level, end_level, start_gfn,
  

[PATCH V8 01/11] KVM: State whether memory should be freed in kvm_free_memslot

2019-01-06 Thread Ahmed Abd El Mawgood
The conditions under which kvm_free_memslot frees memory are rather ad hoc,
which makes it hard to extend the memslot with allocated data that needs to
be freed. This patch replaces the current mechanism with a clear flag that
states whether the memory slot should be freed.

Signed-off-by: Ahmed Abd El Mawgood 
---
 virt/kvm/kvm_main.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1f888a103f..2f37b4b6a2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -548,9 +548,10 @@ static void kvm_destroy_dirty_bitmap(struct 
kvm_memory_slot *memslot)
  * Free any memory in @free but not in @dont.
  */
 static void kvm_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free,
- struct kvm_memory_slot *dont)
+ struct kvm_memory_slot *dont,
+ enum kvm_mr_change change)
 {
-   if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
+   if (change == KVM_MR_DELETE)
kvm_destroy_dirty_bitmap(free);
 
kvm_arch_free_memslot(kvm, free, dont);
@@ -566,7 +567,7 @@ static void kvm_free_memslots(struct kvm *kvm, struct 
kvm_memslots *slots)
return;
 
kvm_for_each_memslot(memslot, slots)
-   kvm_free_memslot(kvm, memslot, NULL);
+   kvm_free_memslot(kvm, memslot, NULL, KVM_MR_DELETE);
 
kvfree(slots);
 }
@@ -1061,14 +1062,14 @@ int __kvm_set_memory_region(struct kvm *kvm,
 
kvm_arch_commit_memory_region(kvm, mem, &old, &new, change);
 
-   kvm_free_memslot(kvm, &old, &new);
+   kvm_free_memslot(kvm, &old, &new, change);
kvfree(old_memslots);
return 0;
 
 out_slots:
kvfree(slots);
 out_free:
-   kvm_free_memslot(kvm, &new, &old);
+   kvm_free_memslot(kvm, &new, &old, change);
 out:
return r;
 }
-- 
2.19.2



[PATCH V8 03/11] KVM: X86: Add helper function to convert SPTE to GFN

2019-01-06 Thread Ahmed Abd El Mawgood
Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/mmu.c | 7 +++
 arch/x86/kvm/mmu.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 098df7d135..bbfe3f2863 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1053,6 +1053,13 @@ static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page 
*sp, int index)
 
return sp->gfn + (index << ((sp->role.level - 1) * PT64_LEVEL_BITS));
 }
+gfn_t spte_to_gfn(u64 *spte)
+{
+   struct kvm_mmu_page *sp;
+
+   sp = page_header(__pa(spte));
+   return kvm_mmu_page_get_gfn(sp, spte - sp->spt);
+}
 
 static void kvm_mmu_page_set_gfn(struct kvm_mmu_page *sp, int index, gfn_t gfn)
 {
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index c7b333147c..49d7f2f002 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -211,4 +211,5 @@ void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, 
gfn_t gfn);
 bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
struct kvm_memory_slot *slot, u64 gfn);
 int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
+gfn_t spte_to_gfn(u64 *sptep);
 #endif
-- 
2.19.2



[PATCH V8 05/11] KVM: Create architecture independent ROE skeleton

2019-01-06 Thread Ahmed Abd El Mawgood
This patch introduces a hypercall that can help defend against a subset of
kernel rootkits. It works by placing read-only protection in the shadow PTEs.
The resulting protection is also kept in a bitmap for each kvm_memory_slot
and is used as a reference when updating SPTEs. The goal is to protect the
guest kernel's static data from modification when an attacker is running in
guest ring 0; for this reason there is no hypercall to revert the effect of
the Memory ROE hypercall. This patch does not implement an integrity check on
the guest TLB, so an obvious attack on the current implementation would
involve remapping guest virtual addresses to guest physical addresses; there
are plans to fix that. For this patch to work on a given arch/, one needs to
implement two architecture-specific functions:
kvm_roe_arch_commit_protection() and kvm_roe_arch_is_userspace(). kvm_roe
must also be invoked through the appropriate hypercall mechanism.

Signed-off-by: Ahmed Abd El Mawgood 
---
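Note (not part of the patch): the core bookkeeping described above is one bit
per page in each memory slot, set once and never cleared. A simplified
userspace model of that bitmap follows. struct slot and the slot_* helpers are
assumptions made for illustration; only the roe_bitmap field name comes from
the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Toy model of a memory slot; the real structure has many more fields. */
struct slot {
	uint64_t base_gfn;
	uint64_t npages;
	unsigned long *roe_bitmap;  /* one bit per page, set = read-only */
};

/* Allocate a zeroed bitmap: no page is protected initially. */
static int slot_init(struct slot *s, uint64_t base_gfn, uint64_t npages)
{
	size_t words = (npages + BITS_PER_WORD - 1) / BITS_PER_WORD;

	s->base_gfn = base_gfn;
	s->npages = npages;
	s->roe_bitmap = calloc(words ? words : 1, sizeof(unsigned long));
	return s->roe_bitmap ? 0 : -1;
}

/* Mark [gfn, gfn + n) read-only. There is deliberately no "unprotect". */
static void slot_protect(struct slot *s, uint64_t gfn, uint64_t n)
{
	uint64_t first = gfn - s->base_gfn;
	uint64_t i;

	assert(gfn >= s->base_gfn && first + n <= s->npages);
	for (i = first; i < first + n; i++)
		s->roe_bitmap[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

/* A write is allowed only if the page's bit is clear. */
static bool slot_write_allowed(const struct slot *s, uint64_t gfn)
{
	uint64_t i = gfn - s->base_gfn;

	assert(gfn >= s->base_gfn && i < s->npages);
	return !((s->roe_bitmap[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL);
}
```

Leaving out any way to clear a bit mirrors the commit message: an attacker in
guest ring 0 must not be able to undo the protection.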
 include/kvm/roe.h |  16 
 include/linux/kvm_host.h  |   1 +
 include/uapi/linux/kvm_para.h |   4 +
 virt/kvm/kvm_main.c   |  19 +++--
 virt/kvm/roe.c| 136 ++
 virt/kvm/roe_generic.h|  19 +
 6 files changed, 190 insertions(+), 5 deletions(-)
 create mode 100644 include/kvm/roe.h
 create mode 100644 virt/kvm/roe.c
 create mode 100644 virt/kvm/roe_generic.h

diff --git a/include/kvm/roe.h b/include/kvm/roe.h
new file mode 100644
index 00..6a86866623
--- /dev/null
+++ b/include/kvm/roe.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __KVM_ROE_H__
+#define __KVM_ROE_H__
+/*
+ * KVM Read Only Enforcement
+ * Copyright (c) 2018 Ahmed Abd El Mawgood
+ *
+ * Author Ahmed Abd El Mawgood 
+ *
+ */
+void kvm_roe_arch_commit_protection(struct kvm *kvm,
+   struct kvm_memory_slot *slot);
+int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3);
+bool kvm_roe_arch_is_userspace(struct kvm_vcpu *vcpu);
+#endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c38cc5eb7e..a627c6e81a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -297,6 +297,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct 
kvm_vcpu *vcpu)
 struct kvm_memory_slot {
gfn_t base_gfn;
unsigned long npages;
+   unsigned long *roe_bitmap;
unsigned long *dirty_bitmap;
struct kvm_arch_memory_slot arch;
unsigned long userspace_addr;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 6c0ce49931..e6004e0750 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -28,7 +28,11 @@
 #define KVM_HC_MIPS_CONSOLE_OUTPUT 8
 #define KVM_HC_CLOCK_PAIRING   9
 #define KVM_HC_SEND_IPI10
+#define KVM_HC_ROE 11
 
+/* ROE Functionality parameters */
+#define ROE_VERSION0
+#define ROE_MPROTECT   1
 /*
  * hypercalls use architecture specific
  */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2f37b4b6a2..88b5fbcbb0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -61,6 +61,7 @@
 #include "coalesced_mmio.h"
 #include "async_pf.h"
 #include "vfio.h"
+#include "roe_generic.h"
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -551,9 +552,10 @@ static void kvm_free_memslot(struct kvm *kvm, struct 
kvm_memory_slot *free,
  struct kvm_memory_slot *dont,
  enum kvm_mr_change change)
 {
-   if (change == KVM_MR_DELETE)
+   if (change == KVM_MR_DELETE) {
+   kvm_roe_free(free);
kvm_destroy_dirty_bitmap(free);
-
+   }
kvm_arch_free_memslot(kvm, free, dont);
 
free->npages = 0;
@@ -1018,6 +1020,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
if (kvm_create_dirty_bitmap(&new) < 0)
goto out_free;
}
+   if (kvm_roe_init(&new) < 0)
+   goto out_free;
 
slots = kvzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
if (!slots)
@@ -1348,13 +1352,18 @@ static bool memslot_is_readonly(struct kvm_memory_slot 
*slot)
return slot->flags & KVM_MEM_READONLY;
 }
 
+static bool gfn_is_readonly(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   return gfn_is_full_roe(slot, gfn) || memslot_is_readonly(slot);
+}
+
 static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
   gfn_t *nr_pages, bool write)
 {
if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
return KVM_HVA_ERR_BAD;
 
-   if (memslot_is_readonly(slot) && write)
+   if (gfn_is_readonly(slot, gfn) && write)
return KVM_HVA_ERR_RO_BAD;
 
if (nr_pages)
@@ -1402,7 +1411,7 @@ unsigned long gfn_to_hva_mems

[PATCH V8 07/11] KVM: Add support for byte granular memory ROE

2019-01-06 Thread Ahmed Abd El Mawgood
This patch documents and implements ROE_MPROTECT_CHUNK, a part of the ROE
hypercall designed to protect regions of a memory page with byte
granularity. This feature provides a key primitive to protect against
attacks involving page remapping.
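
To make the byte-granular check concrete, here is a minimal userspace sketch
of an inclusive range-overlap test in the same spirit as the
kvm_roe_range_overlap() helper in the diff below. The struct chunk_model type
and the chunk_overlaps() name are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct protected_chunk, without list linkage. */
struct chunk_model {
	uint64_t gpa;  /* first protected guest physical address */
	uint64_t size; /* number of protected bytes */
};

/*
 * Return true when the write [gpa, gpa + len - 1] touches at least one byte
 * of the chunk [c->gpa, c->gpa + c->size - 1]. Both ranges are treated as
 * inclusive, so both must be non-empty.
 */
static bool chunk_overlaps(const struct chunk_model *c, uint64_t gpa,
			   uint64_t len)
{
	assert(c->size > 0 && len > 0);
	return gpa <= c->gpa + c->size - 1 && gpa + len - 1 >= c->gpa;
}
```

With a chunk covering 0x1010..0x102f, a 16-byte write at 0x1000 stops one byte
short of the chunk and is allowed, while a 2-byte write at 0x100f reaches into
it and would be rejected.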

Signed-off-by: Ahmed Abd El Mawgood 
---
 include/linux/kvm_host.h  |  24 
 include/uapi/linux/kvm_para.h |   1 +
 virt/kvm/kvm_main.c   |  24 +++-
 virt/kvm/roe.c| 212 --
 virt/kvm/roe_generic.h|   6 +
 5 files changed, 253 insertions(+), 14 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a627c6e81a..9acf5f54ac 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -294,10 +294,34 @@ static inline int kvm_vcpu_exiting_guest_mode(struct 
kvm_vcpu *vcpu)
  */
 #define KVM_MEM_MAX_NR_PAGES ((1UL << 31) - 1)
 
+/*
+ * This structure is used to hold memory areas that are to be protected in a
+ * memory frame with mixed page permissions.
+ **/
+struct protected_chunk {
+   gpa_t gpa;
+   u64 size;
+   struct list_head list;
+};
+
+static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk,
+   gpa_t gpa, int len) {
+   /*
+* https://stackoverflow.com/questions/325933/
+* determine-whether-two-date-ranges-overlap
+* Assuming that it works, that link ^ provides a solution that is
+* better than anything I would ever come up with.
+*/
+   return (gpa <= chunk->gpa + chunk->size - 1) &&
+   (gpa + len - 1 >= chunk->gpa);
+}
+
 struct kvm_memory_slot {
gfn_t base_gfn;
unsigned long npages;
unsigned long *roe_bitmap;
+   unsigned long *partial_roe_bitmap;
+   struct list_head *prot_list;
unsigned long *dirty_bitmap;
struct kvm_arch_memory_slot arch;
unsigned long userspace_addr;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index e6004e0750..4a84f974bc 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -33,6 +33,7 @@
 /* ROE Functionality parameters */
 #define ROE_VERSION0
 #define ROE_MPROTECT   1
+#define ROE_MPROTECT_CHUNK 2
 /*
  * hypercalls use architecture specific
  */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 88b5fbcbb0..819033f475 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1354,18 +1354,19 @@ static bool memslot_is_readonly(struct kvm_memory_slot 
*slot)
 
 static bool gfn_is_readonly(struct kvm_memory_slot *slot, gfn_t gfn)
 {
-   return gfn_is_full_roe(slot, gfn) || memslot_is_readonly(slot);
+   return gfn_is_full_roe(slot, gfn) ||
+  gfn_is_partial_roe(slot, gfn) ||
+  memslot_is_readonly(slot);
 }
 
+
 static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
   gfn_t *nr_pages, bool write)
 {
if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
return KVM_HVA_ERR_BAD;
-
if (gfn_is_readonly(slot, gfn) && write)
return KVM_HVA_ERR_RO_BAD;
-
if (nr_pages)
*nr_pages = slot->npages - (gfn - slot->base_gfn);
 
@@ -1927,14 +1928,29 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, 
gpa_t gpa,
return __kvm_read_guest_atomic(slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
+static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
+   int len)
+{
+   u64 addr;
 
+   if (!slot)
+   return KVM_HVA_ERR_RO_BAD;
+   if (kvm_roe_check_range(slot, gfn, offset, len))
+   return KVM_HVA_ERR_RO_BAD;
+   if (memslot_is_readonly(slot))
+   return KVM_HVA_ERR_RO_BAD;
+   if (gfn_is_full_roe(slot, gfn))
+   return KVM_HVA_ERR_RO_BAD;
+   addr = __gfn_to_hva_many(slot, gfn, NULL, false);
+   return addr;
+}
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
  const void *data, int offset, int len)
 {
int r;
unsigned long addr;
 
-   addr = gfn_to_hva_memslot(memslot, gfn);
+   addr = roe_gfn_to_hva(memslot, gfn, offset, len);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = __copy_to_user((void __user *)addr + offset, data, len);
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index 33d3a4f507..4393a6a6a2 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -11,34 +11,89 @@
 #include 
 #include 
 #include 
+#include "roe_generic.h"
 
 int kvm_roe_init(struct kvm_memory_slot *slot)
 {
slot->roe_bitmap = kvzalloc(BITS_TO_LONGS(slot->npages) *
sizeof(unsigned long), GFP_KERNEL);
if (!slot->roe_bitmap)
-   return -ENOMEM;
+   

[PATCH V8 04/11] KVM: Document Memory ROE

2019-01-06 Thread Ahmed Abd El Mawgood
The ROE version documented here is implemented in the next 2 patches.

Signed-off-by: Ahmed Abd El Mawgood 
---
 Documentation/virtual/kvm/hypercalls.txt | 40 
 1 file changed, 40 insertions(+)

diff --git a/Documentation/virtual/kvm/hypercalls.txt 
b/Documentation/virtual/kvm/hypercalls.txt
index da24c138c8..a31f316ce6 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -141,3 +141,43 @@ a0 corresponds to the APIC ID in the third argument (a2), 
bit 1
 corresponds to the APIC ID a2+1, and so on.
 
 Returns the number of CPUs to which the IPIs were delivered successfully.
+
+7. KVM_HC_ROE
+
+Architecture: x86
+Status: active
+Purpose: Hypercall used to apply Read-Only Enforcement to guest memory and
+registers
+Usage 1:
+ a0: ROE_VERSION
+
+Returns an unsigned number that represents the current version of the ROE
+implementation.
+
+Usage 2:
+
+ a0: ROE_MPROTECT  (requires version >= 1)
+ a1: Start address aligned to page boundary.
+ a2: Number of pages to be protected.
+
+This configuration lets a guest kernel have part of its read/write memory
+converted into read-only.  This action is irreversible.
+Upon successful run, the number of pages protected is returned.
+
+Usage 3:
+ a0: ROE_MPROTECT_CHUNK(requires version >= 2)
+ a1: Start address aligned to page boundary.
+ a2: Number of bytes to be protected.
+This configuration lets a guest kernel have part of its read/write memory
+converted into read-only with byte granularity. ROE_MPROTECT_CHUNK is
+relatively slow compared to ROE_MPROTECT. This action is irreversible.
+Upon successful run, the number of bytes protected is returned.
+
+Error codes:
+   -KVM_ENOSYS: the hypercall was triggered from ring 3 or is not
+   implemented.
+   -EINVAL: error based on given parameters.
+
+Notes: KVM_HC_ROE cannot be triggered from guest ring 3 (user mode). Otherwise,
+malicious user-mode software could use it to enforce read-only protection on
+an arbitrary memory page and thereby crash the kernel.
-- 
2.19.2



[PATCH V8 08/11] KVM: X86: Port ROE_MPROTECT_CHUNK to x86

2019-01-06 Thread Ahmed Abd El Mawgood
Apply memslot->partial_roe_bitmap to shadow page table entries
too.

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/roe.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/roe.c b/arch/x86/kvm/roe.c
index f787106be8..700f69823b 100644
--- a/arch/x86/kvm/roe.c
+++ b/arch/x86/kvm/roe.c
@@ -25,11 +25,14 @@ static bool __rmap_write_protect_roe(struct kvm *kvm,
struct rmap_iterator iter;
bool prot;
bool flush = false;
+   void *full_bmp =  memslot->roe_bitmap;
+   void *part_bmp = memslot->partial_roe_bitmap;
 
for_each_rmap_spte(rmap_head, &iter, sptep) {
int idx = spte_to_gfn(sptep) - memslot->base_gfn;
 
-   prot = !test_bit(idx, memslot->roe_bitmap) && pt_protect;
+   prot = !(test_bit(idx, full_bmp) || test_bit(idx, part_bmp));
+   prot = prot && pt_protect;
flush |= spte_write_protect(sptep, prot);
}
return flush;
-- 
2.19.2



[PATCH V8 06/11] KVM: X86: Enable ROE for x86

2019-01-06 Thread Ahmed Abd El Mawgood
This patch implements kvm_roe_arch_commit_protection and
kvm_roe_arch_is_userspace for x86, and invokes kvm_roe via the
appropriate vmcall.

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/include/asm/kvm_host.h |   2 +-
 arch/x86/kvm/Makefile   |   4 +-
 arch/x86/kvm/mmu.c  |  71 +-
 arch/x86/kvm/mmu.h  |  30 +-
 arch/x86/kvm/roe.c  | 101 
 arch/x86/kvm/roe_arch.h |  28 +
 arch/x86/kvm/x86.c  |  11 ++--
 7 files changed, 183 insertions(+), 64 deletions(-)
 create mode 100644 arch/x86/kvm/roe.c
 create mode 100644 arch/x86/kvm/roe_arch.h

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4660ce90de..797d838c3e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1239,7 +1239,7 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 
accessed_mask,
u64 acc_track_mask, u64 me_mask);
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
-void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
+void kvm_mmu_slot_apply_write_access(struct kvm *kvm,
  struct kvm_memory_slot *memslot);
 void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
   const struct kvm_memory_slot *memslot);
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 69b3a7c300..39f7766afe 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -9,7 +9,9 @@ CFLAGS_vmx.o := -I.
 KVM := ../../../virt/kvm
 
 kvm-y  += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
-   $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o
+  $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o \
+  $(KVM)/roe.o roe.o
+
 kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
 
 kvm-y  += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bbfe3f2863..2e3a43076e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -23,7 +23,7 @@
 #include "x86.h"
 #include "kvm_cache_regs.h"
 #include "cpuid.h"
-
+#include "roe_arch.h"
 #include 
 #include 
 #include 
@@ -1343,8 +1343,8 @@ static void pte_list_remove(struct kvm_rmap_head 
*rmap_head, u64 *sptep)
__pte_list_remove(sptep, rmap_head);
 }
 
-static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
-  struct kvm_memory_slot *slot)
+struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
+   struct kvm_memory_slot *slot)
 {
unsigned long idx;
 
@@ -1394,16 +1394,6 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
__pte_list_remove(spte, rmap_head);
 }
 
-/*
- * Used by the following functions to iterate through the sptes linked by a
- * rmap.  All fields are private and not assumed to be used outside.
- */
-struct rmap_iterator {
-   /* private fields */
-   struct pte_list_desc *desc; /* holds the sptep if not NULL */
-   int pos;/* index of the sptep */
-};
-
 /*
  * Iteration must be started by this function.  This should also be used after
  * removing/dropping sptes from the rmap link because in such cases the
@@ -1411,8 +1401,7 @@ struct rmap_iterator {
  *
  * Returns sptep if found, NULL otherwise.
  */
-static u64 *rmap_get_first(struct kvm_rmap_head *rmap_head,
-  struct rmap_iterator *iter)
+u64 *rmap_get_first(struct kvm_rmap_head *rmap_head, struct rmap_iterator 
*iter)
 {
u64 *sptep;
 
@@ -1438,7 +1427,7 @@ static u64 *rmap_get_first(struct kvm_rmap_head 
*rmap_head,
  *
  * Returns sptep if found, NULL otherwise.
  */
-static u64 *rmap_get_next(struct rmap_iterator *iter)
+u64 *rmap_get_next(struct rmap_iterator *iter)
 {
u64 *sptep;
 
@@ -1513,7 +1502,7 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 
*sptep)
  *
  * Return true if tlb need be flushed.
  */
-static bool spte_write_protect(u64 *sptep, bool pt_protect)
+bool spte_write_protect(u64 *sptep, bool pt_protect)
 {
u64 spte = *sptep;
 
@@ -1531,8 +1520,7 @@ static bool spte_write_protect(u64 *sptep, bool 
pt_protect)
 }
 
 static bool __rmap_write_protect(struct kvm *kvm,
-struct kvm_rmap_head *rmap_head,
-bool pt_protect, void *data)
+   struct kvm_rmap_head *rmap_head, bool pt_protect)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1631,7 +1619,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm 
*kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_write_protect(kvm, rmap_head, false, NULL);
+  

[PATCH V8 09/11] KVM: Add new exit reason For ROE violations

2019-01-06 Thread Ahmed Abd El Mawgood
QEMU cannot otherwise detect ROE violations. One option would be to create
a host API that tells whether a given page is ROE protected; the other is
to create an ROE violation exit reason. This patch adds KVM_EXIT_ROE.
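
The decision made in emulator_read_write() below can be sketched in userspace
C as follows. All MODEL_* names and model_*() helpers are hypothetical; a
4 KiB page size is assumed, and the ROE lookup itself is reduced to a boolean
supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4 KiB pages; the mask clears the in-page offset bits. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (UINT64_C(1) << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_MASK  (~(MODEL_PAGE_SIZE - 1))

enum model_exit { MODEL_EXIT_MMIO, MODEL_EXIT_ROE };

/* Guest frame number: the guest physical address without its offset bits. */
static uint64_t model_gfn(uint64_t gpa)
{
	return gpa >> MODEL_PAGE_SHIFT;
}

/* Byte offset inside the page, written the same way as in the diff. */
static uint64_t model_offset(uint64_t gpa)
{
	uint64_t off = gpa - (gpa & MODEL_PAGE_MASK);

	assert(off < MODEL_PAGE_SIZE);
	return off;
}

/*
 * Report an ROE exit only for a write that lands in a known slot and hits a
 * protected page or chunk; everything else stays an MMIO exit.
 */
static enum model_exit model_pick_exit(bool has_slot, bool is_write,
				       bool roe_hit)
{
	if (has_slot && is_write && roe_hit)
		return MODEL_EXIT_ROE;
	return MODEL_EXIT_MMIO;
}
```

Keeping MMIO as the fallback means userspace that does not know about the new
exit reason only sees it for writes to ROE-protected memory.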

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/x86.c   | 10 +-
 include/kvm/roe.h| 12 
 include/uapi/linux/kvm.h |  2 +-
 virt/kvm/kvm_main.c  |  1 +
 virt/kvm/roe.c   |  2 +-
 virt/kvm/roe_generic.h   |  9 +
 6 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 19b0f2307e..368e3d99fd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5409,6 +5409,7 @@ static int emulator_read_write(struct x86_emulate_ctxt 
*ctxt,
const struct read_write_emulator_ops *ops)
 {
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+   struct kvm_memory_slot *slot;
gpa_t gpa;
int rc;
 
@@ -5450,7 +5451,14 @@ static int emulator_read_write(struct x86_emulate_ctxt 
*ctxt,
 
vcpu->run->mmio.len = min(8u, vcpu->mmio_fragments[0].len);
vcpu->run->mmio.is_write = vcpu->mmio_is_write = ops->write;
-   vcpu->run->exit_reason = KVM_EXIT_MMIO;
+   slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa >> PAGE_SHIFT);
+   if (slot && ops->write && (kvm_roe_check_range(slot, gpa>>PAGE_SHIFT,
+   gpa - (gpa & PAGE_MASK), bytes) ||
+   gfn_is_full_roe(slot, gpa>>PAGE_SHIFT)))
+   vcpu->run->exit_reason = KVM_EXIT_ROE;
+   else
+   vcpu->run->exit_reason = KVM_EXIT_MMIO;
+
vcpu->run->mmio.phys_addr = gpa;
 
return ops->read_write_exit_mmio(vcpu, gpa, val, bytes);
diff --git a/include/kvm/roe.h b/include/kvm/roe.h
index 6a86866623..3121a67753 100644
--- a/include/kvm/roe.h
+++ b/include/kvm/roe.h
@@ -13,4 +13,16 @@ void kvm_roe_arch_commit_protection(struct kvm *kvm,
struct kvm_memory_slot *slot);
 int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3);
 bool kvm_roe_arch_is_userspace(struct kvm_vcpu *vcpu);
+bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
+   int len);
+static inline bool gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   return test_bit(gfn - slot->base_gfn, slot->roe_bitmap);
+
+}
+static inline bool gfn_is_partial_roe(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   return test_bit(gfn - slot->base_gfn, slot->partial_roe_bitmap);
+}
+
 #endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 6d4ea4b6c9..0a386bb5f2 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -235,7 +235,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI25
 #define KVM_EXIT_IOAPIC_EOI   26
 #define KVM_EXIT_HYPERV   27
-
+#define KVM_EXIT_ROE 28
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
 #define KVM_INTERNAL_ERROR_EMULATION   1
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 819033f475..d92d300539 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -62,6 +62,7 @@
 #include "async_pf.h"
 #include "vfio.h"
 #include "roe_generic.h"
+#include 
 
 #define CREATE_TRACE_POINTS
 #include 
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index 4393a6a6a2..9540473f89 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -60,7 +60,7 @@ bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t 
gfn, int offset,
return false;
return kvm_roe_protected_range(slot, gpa, len);
 }
-
+EXPORT_SYMBOL_GPL(kvm_roe_check_range);
 
 void kvm_roe_free(struct kvm_memory_slot *slot)
 {
diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h
index ad121372f2..f1ce4a8aec 100644
--- a/virt/kvm/roe_generic.h
+++ b/virt/kvm/roe_generic.h
@@ -14,12 +14,5 @@ void kvm_roe_free(struct kvm_memory_slot *slot);
 int kvm_roe_init(struct kvm_memory_slot *slot);
 bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
int len);
-static inline bool gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn)
-{
-   return test_bit(gfn - slot->base_gfn, slot->roe_bitmap);
-}
-static inline bool gfn_is_partial_roe(struct kvm_memory_slot *slot, gfn_t gfn)
-{
-   return test_bit(gfn - slot->base_gfn, slot->partial_roe_bitmap);
-}
+
 #endif
-- 
2.19.2



[PATCH V8 10/11] KVM: Log ROE violations in system log

2019-01-06 Thread Ahmed Abd El Mawgood
Signed-off-by: Ahmed Abd El Mawgood 
---
 virt/kvm/kvm_main.c|  3 ++-
 virt/kvm/roe.c | 25 +
 virt/kvm/roe_generic.h |  3 ++-
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d92d300539..b3dc7255b0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1945,13 +1945,14 @@ static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, 
gfn_t gfn, int offset,
addr = __gfn_to_hva_many(slot, gfn, NULL, false);
return addr;
 }
+
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
  const void *data, int offset, int len)
 {
int r;
unsigned long addr;
-
addr = roe_gfn_to_hva(memslot, gfn, offset, len);
+   kvm_roe_check_and_log(memslot, gfn, data, offset, len);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = __copy_to_user((void __user *)addr + offset, data, len);
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index 9540473f89..e424b45e1c 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -76,6 +76,31 @@ void kvm_roe_free(struct kvm_memory_slot *slot)
kvfree(slot->prot_list);
 }
 
+static void kvm_warning_roe_violation(u64 addr, const void *data, int len)
+{
+   int i;
+   const unsigned char *d = data;
+   char *buf = kvmalloc(len * 3 + 1, GFP_KERNEL);
+
+   for (i = 0; i < len; i++)
+   sprintf(buf+3*i, " %02x", d[i]);
+   pr_warn("ROE violation:\n");
+   pr_warn("\tAttempt to write %d bytes at address 0x%08llx\n", len, addr);
+   pr_warn("\tData: %s\n", buf);
+   kvfree(buf);
+}
+
+void kvm_roe_check_and_log(struct kvm_memory_slot *memslot, gfn_t gfn,
+   const void *data, int offset, int len)
+{
+   if (!memslot)
+   return;
+   if (!gfn_is_full_roe(memslot, gfn) &&
+   !kvm_roe_check_range(memslot, gfn, offset, len))
+   return;
+   kvm_warning_roe_violation((gfn << PAGE_SHIFT) + offset, data, len);
+}
+
 static void kvm_roe_protect_slot(struct kvm *kvm, struct kvm_memory_slot *slot,
gfn_t gfn, u64 npages, bool partial)
 {
diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h
index f1ce4a8aec..6c5f0cf381 100644
--- a/virt/kvm/roe_generic.h
+++ b/virt/kvm/roe_generic.h
@@ -14,5 +14,6 @@ void kvm_roe_free(struct kvm_memory_slot *slot);
 int kvm_roe_init(struct kvm_memory_slot *slot);
 bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
int len);
-
+void kvm_roe_check_and_log(struct kvm_memory_slot *memslot, gfn_t gfn,
+   const void *data, int offset, int len);
 #endif
-- 
2.19.2



[PATCH V8 11/11] KVM: ROE: Store protected chunks in red black tree

2019-01-06 Thread Ahmed Abd El Mawgood
The old way of storing protected chunks was a linked list, which made
searching for a chunk take linear time. With 2000 chunks, reading the
last chunk was about 10 times slower than reading the first chunk. This
patch stores the chunks in a red-black tree for faster search.
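
The lookup idea can be sketched in userspace C. Note the substitution: this
sketch uses binary search over a sorted array of non-overlapping chunks
instead of the kernel rbtree, purely to stay self-contained; the three-way
comparator follows the same convention as kvm_roe_range_cmp_position() in the
diff below. struct chunk_model, range_cmp_position() and range_is_protected()
are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct protected_chunk, without tree linkage. */
struct chunk_model {
	uint64_t gpa;  /* first protected guest physical address */
	uint64_t size; /* number of protected bytes */
};

/*
 * Three-way position comparison of the range [gpa, gpa + len) against a
 * chunk: -1 if the range lies entirely below it, 1 if entirely above it,
 * 0 if the two overlap.
 */
static int range_cmp_position(const struct chunk_model *c, uint64_t gpa,
			      uint64_t len)
{
	if (gpa + len <= c->gpa)
		return -1;
	if (gpa >= c->gpa + c->size)
		return 1;
	return 0;
}

/*
 * Logarithmic lookup. The array must be sorted by gpa and its chunks must
 * not overlap, which is the same ordering invariant a search tree keeps.
 */
static bool range_is_protected(const struct chunk_model *chunks, size_t n,
			       uint64_t gpa, uint64_t len)
{
	size_t lo = 0, hi = n;

	assert(len > 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = range_cmp_position(&chunks[mid], gpa, len);

		if (cmp < 0)
			hi = mid;     /* target is before the current chunk */
		else if (cmp > 0)
			lo = mid + 1; /* target is after the current chunk */
		else
			return true;
	}
	return false;
}
```

Each step discards half of the remaining chunks, so a lookup among 2000
chunks needs about 11 comparisons rather than up to 2000.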

Signed-off-by: Ahmed Abd El Mawgood 
---
 include/linux/kvm_host.h |  36 ++-
 virt/kvm/roe.c   | 228 +++
 virt/kvm/roe_generic.h   |   3 +
 3 files changed, 197 insertions(+), 70 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9acf5f54ac..5f4bec0662 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -301,7 +302,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct 
kvm_vcpu *vcpu)
 struct protected_chunk {
gpa_t gpa;
u64 size;
-   struct list_head list;
+   struct rb_node node;
 };
 
 static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk,
@@ -316,12 +317,43 @@ static inline bool kvm_roe_range_overlap(struct 
protected_chunk *chunk,
(gpa + len - 1 >= chunk->gpa);
 }
 
+static inline int kvm_roe_range_cmp_position(struct protected_chunk *chunk,
+   gpa_t gpa, int len) {
+   /*
+* returns -1 if the range [gpa, gpa + len) lies entirely below the chunk
+* returns 0 if the range overlaps the chunk
+* returns 1 if the range lies entirely above the chunk
+*/
+
+   if (gpa + len <= chunk->gpa)
+   return -1;
+   if (gpa >= chunk->gpa + chunk->size)
+   return 1;
+   return 0;
+}
+
+static inline int kvm_roe_range_cmp_mergability(struct protected_chunk *chunk,
+   gpa_t gpa, int len) {
+   /*
+* returns -1 if the gpa and len are smaller than chunk and not adjacent
+* to it
+* returns 0 if they overlap or strictly adjacent
+* returns 1 if gpa and len are bigger than the chunk and not adjacent
+* to it
+*/
+   if (gpa + len < chunk->gpa)
+   return -1;
+   if (gpa > chunk->gpa + chunk->size)
+   return 1;
+   return 0;
+
+}
 struct kvm_memory_slot {
gfn_t base_gfn;
unsigned long npages;
unsigned long *roe_bitmap;
unsigned long *partial_roe_bitmap;
-   struct list_head *prot_list;
+   struct rb_root  *prot_root;
unsigned long *dirty_bitmap;
struct kvm_arch_memory_slot arch;
unsigned long userspace_addr;
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index e424b45e1c..15297c0e57 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -23,10 +23,10 @@ int kvm_roe_init(struct kvm_memory_slot *slot)
sizeof(unsigned long), GFP_KERNEL);
if (!slot->partial_roe_bitmap)
goto fail2;
-   slot->prot_list = kvzalloc(sizeof(struct list_head), GFP_KERNEL);
-   if (!slot->prot_list)
+   slot->prot_root = kvzalloc(sizeof(struct rb_root), GFP_KERNEL);
+   if (!slot->prot_root)
goto fail3;
-   INIT_LIST_HEAD(slot->prot_list);
+   *slot->prot_root = RB_ROOT;
return 0;
 fail3:
kvfree(slot->partial_roe_bitmap);
@@ -40,12 +40,19 @@ int kvm_roe_init(struct kvm_memory_slot *slot)
 static bool kvm_roe_protected_range(struct kvm_memory_slot *slot, gpa_t gpa,
int len)
 {
-   struct list_head *pos;
-   struct protected_chunk *cur_chunk;
-
-   list_for_each(pos, slot->prot_list) {
-   cur_chunk = list_entry(pos, struct protected_chunk, list);
-   if (kvm_roe_range_overlap(cur_chunk, gpa, len))
+   struct rb_node *node = slot->prot_root->rb_node;
+
+   while (node) {
+   struct protected_chunk *cur_chunk;
+   int cmp;
+
+   cur_chunk = rb_entry(node, struct protected_chunk, node);
+   cmp = kvm_roe_range_cmp_position(cur_chunk, gpa, len);
+   if (cmp < 0)/*target chunk is before current node*/
+   node = node->rb_left;
+   else if (cmp > 0)/*target chunk is after current node*/
+   node = node->rb_right;
+   else
return true;
}
return false;
@@ -62,18 +69,24 @@ bool kvm_roe_check_range(struct kvm_memory_slot *slot, 
gfn_t gfn, int offset,
 }
 EXPORT_SYMBOL_GPL(kvm_roe_check_range);
 
-void kvm_roe_free(struct kvm_memory_slot *slot)
+static void kvm_roe_destroy_tree(struct rb_node *node)
 {
-   struct protected_chunk *pos, *n;
-   struct list_head *head = slot->prot_list;
+   struct protected_chunk *cur_chunk;
+
+   if (!node)
+   return;
+   kvm_roe_destroy_tree(node->rb_left);
+   kvm_roe_destroy_tree(node->rb_right);
+   cur_chunk = rb_ent

[PATCH V6 8/8] KVM: Log ROE violations in system log

2018-11-04 Thread Ahmed Abd El Mawgood
Signed-off-by: Ahmed Abd El Mawgood 
---
 virt/kvm/kvm_main.c|  7 +++
 virt/kvm/roe.c | 14 ++
 virt/kvm/roe_generic.h |  2 ++
 3 files changed, 23 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 48c5d9d9474e..d625db7f5350 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -63,6 +63,8 @@
 #include "vfio.h"
 #include "roe_generic.h"
 
+#include 
+
 #define CREATE_TRACE_POINTS
 #include 
 
@@ -1867,6 +1869,7 @@ static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, 
gfn_t gfn, int offset,
addr = __gfn_to_hva_many(slot, gfn, NULL, false);
return addr;
 }
+
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
  const void *data, int offset, int len)
 {
@@ -1874,6 +1877,10 @@ static int __kvm_write_guest_page(struct kvm_memory_slot 
*memslot, gfn_t gfn,
unsigned long addr;
 
addr = roe_gfn_to_hva(memslot, gfn, offset, len);
+   if (gfn_is_full_roe(memslot, gfn) ||
+   kvm_roe_check_range(memslot, gfn, offset, len))
+   kvm_warning_roe_violation((gfn << PAGE_SHIFT) + offset, data,
+   len);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = __copy_to_user((void __user *)addr + offset, data, len);
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index e94314fed3a3..c30c6b028638 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -76,6 +76,20 @@ void kvm_roe_free(struct kvm_memory_slot *slot)
kvfree(slot->prot_list);
 }
 
+void kvm_warning_roe_violation(u64 addr, const void *data, int len)
+{
+   int i;
+   const unsigned char *d = data;
+   char *buf = kvmalloc(len * 3 + 1, GFP_KERNEL);
+
+   for (i = 0; i < len; i++)
+   sprintf(buf+3*i, " %02x", d[i]);
+   pr_warn("ROE violation:\n");
+   pr_warn("\tAttempt to write %d bytes at address 0x%08llx\n", len, addr);
+   pr_warn("\tData: %s\n", buf);
+   kvfree(buf);
+}
+
 static void kvm_roe_protect_slot(struct kvm *kvm, struct kvm_memory_slot *slot,
gfn_t gfn, u64 npages, bool partial)
 {
diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h
index 006fc7b52bba..bce426441468 100644
--- a/virt/kvm/roe_generic.h
+++ b/virt/kvm/roe_generic.h
@@ -11,6 +11,7 @@
  */
 #ifdef CONFIG_KVM_ROE
 
+void kvm_warning_roe_violation(u64 addr, const void *data, int len);
 void kvm_roe_free(struct kvm_memory_slot *slot);
 int kvm_roe_init(struct kvm_memory_slot *slot);
 bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
@@ -39,6 +40,7 @@ static bool kvm_roe_check_range(struct kvm_memory_slot *slot, 
gfn_t gfn,
 {
return false;
 }
+static void kvm_warning_roe_violation(u64 addr, const void *data, int len) {}
 #endif
 
 #endif
-- 
2.18.1



[PATCH V6 7/8] KVM: X86: Port ROE_MPROTECT_CHUNK to x86

2018-11-04 Thread Ahmed Abd El Mawgood
Apply d->memslot->partial_roe_bitmap to shadow page table entries
too.

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/roe.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/roe.c b/arch/x86/kvm/roe.c
index cd3e6944c15f..b2b50fbcd598 100644
--- a/arch/x86/kvm/roe.c
+++ b/arch/x86/kvm/roe.c
@@ -25,9 +25,12 @@ static bool __rmap_write_protect_roe(struct kvm *kvm,
struct rmap_iterator iter;
bool prot;
bool flush = false;
+   void *full_bmp =  d->memslot->roe_bitmap;
+   void *part_bmp = d->memslot->partial_roe_bitmap;
 
for_each_rmap_spte(rmap_head, &iter, sptep) {
-   prot = !test_bit(d->i, d->memslot->roe_bitmap) && pt_protect;
+   prot = !(test_bit(d->i, full_bmp) || test_bit(d->i, part_bmp));
+   prot = prot && pt_protect;
flush |= spte_write_protect(sptep, prot);
d->i++;
}
-- 
2.18.1



[PATCH V6 2/8] KVM: X86: Add arbitrary data pointer in kvm memslot iterator functions

2018-11-04 Thread Ahmed Abd El Mawgood
This helps pass data into the slot_level_handler callback. In my case I
need to share a counter of the pages traversed so that it can be used as an
index into a bitmap. Being able to pass an arbitrary memory pointer into the
slot_level_handler callback makes this easy.
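
The pattern is the usual C idiom of threading an opaque pointer through a
callback. A minimal userspace sketch follows; level_handler, walk_items(),
struct walk_state and count_and_flag_odd() are hypothetical names used only
for illustration and do not appear in the KVM sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Handler type: receives one item plus caller-owned state via void *. */
typedef bool (*level_handler)(int item, void *data);

/*
 * Call fn on every item and OR the results together, similar to the way a
 * "flush needed" flag is accumulated over a walk.
 */
static bool walk_items(const int *items, size_t n, level_handler fn,
		       void *data)
{
	bool flush = false;
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < n; i++)
		flush |= fn(items[i], data);
	return flush;
}

/* Caller-defined state shared across all handler invocations. */
struct walk_state {
	size_t visited; /* running count of items seen so far */
};

/* Example handler: bump the shared counter and flag odd items. */
static bool count_and_flag_odd(int item, void *data)
{
	struct walk_state *st = data;

	st->visited++;
	return (item & 1) != 0;
}
```

Handlers that need no shared state can ignore the pointer and callers can
pass NULL, which is what the unchanged call sites in this patch do.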

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/mmu.c | 65 ++
 1 file changed, 37 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index cf5f572f2305..c54ec914935b 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1492,7 +1492,7 @@ static bool spte_write_protect(u64 *sptep, bool 
pt_protect)
 
 static bool __rmap_write_protect(struct kvm *kvm,
 struct kvm_rmap_head *rmap_head,
-bool pt_protect)
+bool pt_protect, void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1531,7 +1531,8 @@ static bool wrprot_ad_disabled_spte(u64 *sptep)
  * - W bit on ad-disabled SPTEs.
  * Returns true iff any D or W bits were cleared.
  */
-static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head 
*rmap_head)
+static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head 
*rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1557,7 +1558,8 @@ static bool spte_set_dirty(u64 *sptep)
return mmu_spte_update(sptep, spte);
 }
 
-static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
+static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1589,7 +1591,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm 
*kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_write_protect(kvm, rmap_head, false);
+   __rmap_write_protect(kvm, rmap_head, false, NULL);
 
/* clear the first set bit */
mask &= mask - 1;
@@ -1615,7 +1617,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_clear_dirty(kvm, rmap_head);
+   __rmap_clear_dirty(kvm, rmap_head, NULL);
 
/* clear the first set bit */
mask &= mask - 1;
@@ -1668,7 +1670,8 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 
for (i = PT_PAGE_TABLE_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
rmap_head = __gfn_to_rmap(gfn, i, slot);
-   write_protected |= __rmap_write_protect(kvm, rmap_head, true);
+   write_protected |= __rmap_write_protect(kvm, rmap_head, true,
+   NULL);
}
 
return write_protected;
@@ -1682,7 +1685,8 @@ static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 
gfn)
return kvm_mmu_slot_gfn_write_protect(vcpu->kvm, slot, gfn);
 }
 
-static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
+static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1702,7 +1706,7 @@ static int kvm_unmap_rmapp(struct kvm *kvm, struct 
kvm_rmap_head *rmap_head,
   struct kvm_memory_slot *slot, gfn_t gfn, int level,
   unsigned long data)
 {
-   return kvm_zap_rmapp(kvm, rmap_head);
+   return kvm_zap_rmapp(kvm, rmap_head, NULL);
 }
 
 static int kvm_set_pte_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
@@ -5523,13 +5527,15 @@ void kvm_mmu_uninit_vm(struct kvm *kvm)
 }
 
 /* The return value indicates if tlb flush on all vcpus is needed. */
-typedef bool (*slot_level_handler) (struct kvm *kvm, struct kvm_rmap_head 
*rmap_head);
+typedef bool (*slot_level_handler) (struct kvm *kvm,
+   struct kvm_rmap_head *rmap_head, void *data);
 
 /* The caller should hold mmu-lock before calling this function. */
 static __always_inline bool
 slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot,
slot_level_handler fn, int start_level, int end_level,
-   gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb)
+   gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb,
+   void *data)
 {
struct slot_rmap_walk_iterator iterator;
bool flush = false;
@@ -5537,7 +5543,7 @@ slot_handle_level_range(struct kvm *kvm, struct 
kvm_memory_slot *memslot,
for_each_slot_rmap_range(memslot, start_level, end_level, start_gfn,
  

[PATCH V6 3/8] KVM: Document Memory ROE

2018-11-04 Thread Ahmed Abd El Mawgood
The ROE version documented here is implemented in the next two patches.

Signed-off-by: Ahmed Abd El Mawgood 
---
 Documentation/virtual/kvm/hypercalls.txt | 31 
 1 file changed, 31 insertions(+)

diff --git a/Documentation/virtual/kvm/hypercalls.txt 
b/Documentation/virtual/kvm/hypercalls.txt
index da24c138c8d1..8af64d826f03 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -141,3 +141,34 @@ a0 corresponds to the APIC ID in the third argument (a2), 
bit 1
 corresponds to the APIC ID a2+1, and so on.
 
 Returns the number of CPUs to which the IPIs were delivered successfully.
+
+7. KVM_HC_ROE
+
+Architecture: x86
+Status: active
+Purpose: Hypercall used to apply Read-Only Enforcement to guest memory and
+registers
+Usage 1:
+ a0: ROE_VERSION
+
+Returns an unsigned number that represents the current version of the ROE
+implementation.
+
+Usage 2:
+
+ a0: ROE_MPROTECT  (requires version >= 1)
+ a1: Start address aligned to page boundary.
+ a2: Number of pages to be protected.
+
+This configuration lets a guest kernel have part of its read/write memory
+converted into read-only.  This action is irreversible.
+Upon successful run, the number of pages protected is returned.
+
+Error codes:
+   -KVM_ENOSYS: the hypercall was triggered from ring 3 or is not
+   implemented.
+   -EINVAL: the given parameters are invalid.
+
+Notes: KVM_HC_ROE cannot be triggered from guest ring 3 (user mode). The
+reason is that malicious user-mode software could use it to enforce
+read-only protection on an arbitrary memory page, thus crashing the kernel.
-- 
2.18.1
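
The argument convention documented in this patch can be modelled in plain
user-space C. The following is only a sketch under stated assumptions, not
the code from this series: every sketch_/SKETCH_ name is hypothetical, the
page size is assumed to be 4 KiB, the value 1000 for KVM_ENOSYS is taken from
the uapi kvm_para.h header, and treating an unaligned start address or a zero
page count as -EINVAL is my reading of "error based on given parameters".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE    4096ULL /* assumed 4 KiB pages */
#define SKETCH_ROE_VERSION  0       /* a0 selector: query the version */
#define SKETCH_ROE_MPROTECT 1       /* a0 selector: protect whole pages */
#define SKETCH_IMPL_VERSION 1       /* hypothetical implementation version */
#define SKETCH_KVM_ENOSYS   1000    /* value of KVM_ENOSYS in uapi kvm_para.h */

/*
 * Hypothetical model of the documented behaviour: returns the version for
 * ROE_VERSION, the number of pages protected for ROE_MPROTECT, or a
 * negative error code.
 */
static int64_t sketch_roe_hypercall(uint64_t a0, uint64_t a1, uint64_t a2,
                                    int guest_ring)
{
    assert(guest_ring >= 0 && guest_ring <= 3); /* x86 has rings 0..3 */

    /* The hypercall is refused when issued from guest user mode. */
    if (guest_ring != 0)
        return -SKETCH_KVM_ENOSYS;

    switch (a0) {
    case SKETCH_ROE_VERSION:
        return SKETCH_IMPL_VERSION;
    case SKETCH_ROE_MPROTECT:
        /* Assumption: a1 must be page aligned and a2 must be non-zero. */
        if (a1 % SKETCH_PAGE_SIZE != 0 || a2 == 0)
            return -EINVAL;
        return (int64_t)a2; /* number of pages protected */
    default:
        return -SKETCH_KVM_ENOSYS; /* unknown sub-function */
    }
}
```

A guest would first issue the ROE_VERSION call and only use ROE_MPROTECT when
the returned version is at least 1, as the documentation above requires.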



[PATCH V4 3/5] KVM: X86: Adding skeleton for Memory ROE

2018-10-20 Thread Ahmed Abd El Mawgood
This patch introduces a hypercall, implemented for x86, that can help
defend against a subset of kernel rootkits. It works by placing read-only
protection in the shadow PTEs. The resulting protection is also recorded in
a bitmap for each kvm_memory_slot, which is used as a reference when
updating SPTEs. The goal is to protect the guest kernel's static data from
modification even if the attacker is running in guest ring 0; for this
reason there is no hypercall to revert the effect of the Memory ROE
hypercall. This patch doesn't implement an integrity check on the guest TLB,
so an obvious attack on the current implementation would remap guest
virtual addresses to different guest physical addresses, but there are
plans to fix that.
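
The per-memslot bitmap idea can be illustrated with a small user-space
sketch. It is not the kernel code from this patch: the struct, the fixed
slot size and all sketch_ names are hypothetical, and the real
implementation uses the kernel's test_bit() helpers on slot->roe_bitmap.
What it shows is that a page, once marked, stays marked (there is no clear
operation), and that write permission is granted only when the page is not
marked.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NPAGES        128 /* hypothetical fixed slot size */
#define SKETCH_LONGS \
    ((SKETCH_NPAGES + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Hypothetical stand-in for the fields of kvm_memory_slot that matter here. */
struct sketch_memslot {
    unsigned long base_gfn;
    unsigned long npages;
    unsigned long roe_bitmap[SKETCH_LONGS]; /* one bit per guest page */
};

/* Mark a guest frame as read-only enforced; deliberately irreversible. */
static void sketch_roe_protect(struct sketch_memslot *slot, unsigned long gfn)
{
    unsigned long idx = gfn - slot->base_gfn;

    assert(gfn >= slot->base_gfn && idx < slot->npages);
    slot->roe_bitmap[idx / SKETCH_BITS_PER_LONG] |=
        1UL << (idx % SKETCH_BITS_PER_LONG);
}

/* Test whether a guest frame inside the slot has been protected. */
static bool sketch_gfn_is_roe(const struct sketch_memslot *slot,
                              unsigned long gfn)
{
    unsigned long idx = gfn - slot->base_gfn;

    if (gfn < slot->base_gfn || idx >= slot->npages)
        return false;
    return (slot->roe_bitmap[idx / SKETCH_BITS_PER_LONG] >>
            (idx % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* When an SPTE is rebuilt, write access survives only for unprotected pages. */
static bool sketch_spte_writable(const struct sketch_memslot *slot,
                                 unsigned long gfn, bool want_write)
{
    return want_write && !sketch_gfn_is_roe(slot, gfn);
}
```

Because the bitmap is consulted every time an SPTE is updated, the protection
does not depend on a particular SPTE surviving in the shadow page tables.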

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/include/asm/kvm_host.h |  11 ++-
 arch/x86/kvm/Kconfig|   7 ++
 arch/x86/kvm/mmu.c  |  72 +---
 arch/x86/kvm/x86.c  | 143 +++-
 include/linux/kvm_host.h|   3 +
 include/uapi/linux/kvm_para.h   |   4 +
 virt/kvm/kvm_main.c |  34 +++-
 7 files changed, 255 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 09b2e3e2cf1b..aa080c3e302e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -238,6 +238,15 @@ struct kvm_mmu_memory_cache {
void *objects[KVM_NR_MEM_OBJS];
 };
 
+/*
+ * This is an internal structure used to access the kvm memory slot and
+ * keep track of the current PTE index when doing a shadow PTE walk
+ */
+struct kvm_write_access_data {
+   int i;
+   struct kvm_memory_slot *memslot;
+};
+
 /*
  * the pages used as guest page table on soft mmu are tracked by
  * kvm_memory_slot.arch.gfn_track which is 16 bits, so the role bits used
@@ -1178,7 +1187,7 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 
accessed_mask,
u64 acc_track_mask, u64 me_mask);
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
-void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
+void kvm_mmu_slot_apply_write_access(struct kvm *kvm,
  struct kvm_memory_slot *memslot);
 void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
   const struct kvm_memory_slot *memslot);
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 1bbec387d289..2fcbb1788a24 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -96,6 +96,13 @@ config KVM_MMU_AUDIT
 This option adds a R/W kVM module parameter 'mmu_audit', which allows
 auditing of KVM MMU events at runtime.
 
+config KVM_ROE
+   bool "Hypercall Memory Read-Only Enforcement"
+   depends on KVM && X86
+   help
+   This option adds KVM_HC_ROE hypercall to kvm as a hardening
+   mechanism to protect memory pages from being edited.
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/vhost/Kconfig
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index cc36abe1ee44..c54aa5287e14 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1484,9 +1484,8 @@ static bool spte_write_protect(u64 *sptep, bool 
pt_protect)
return mmu_spte_update(sptep, spte);
 }
 
-static bool __rmap_write_protect(struct kvm *kvm,
-struct kvm_rmap_head *rmap_head,
-bool pt_protect, void *data)
+static bool __rmap_write_protection(struct kvm *kvm,
+   struct kvm_rmap_head *rmap_head, bool pt_protect)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1498,6 +1497,38 @@ static bool __rmap_write_protect(struct kvm *kvm,
return flush;
 }
 
+#ifdef CONFIG_KVM_ROE
+static bool __rmap_write_protect_roe(struct kvm *kvm,
+   struct kvm_rmap_head *rmap_head,
+   bool pt_protect,
+   struct kvm_write_access_data *d)
+{
+   u64 *sptep;
+   struct rmap_iterator iter;
+   bool prot;
+   bool flush = false;
+
+   for_each_rmap_spte(rmap_head, &iter, sptep) {
+   prot = !test_bit(d->i, d->memslot->roe_bitmap) && pt_protect;
+   flush |= spte_write_protect(sptep, prot);
+   d->i++;
+   }
+   return flush;
+}
+#endif
+
+static bool __rmap_write_protect(struct kvm *kvm,
+   struct kvm_rmap_head *rmap_head,
+   bool pt_protect,
+   struct kvm_write_access_data *d)
+{
+#ifdef CONFIG_KVM_ROE
+   if (d != NULL)
+   return __rmap_write_protect_roe(kvm, rmap_head, pt_protect, d);
+#endif
+   return __rmap_write_protection(kvm, rmap_head, pt_protect);
+}
+
 static bool spte_clear_dirty(u64 *sptep)
 {
u64 spte = *sptep;
@@ -1585,7 +1616,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm 
*kvm,
while (mask) {
rm

[RFC] kvm: Adding skelaton for Memory ROE

2018-07-16 Thread Ahmed Abd El Mawgood
This is my first patch, an attempt to implement Memory ROE, which I
discussed earlier as a way to prevent rootkits. I have already explained
it in detail in this thread:
https://www.mail-archive.com/kernelnewbies@kernelnewbies.org/msg18826.html
so I don't think there is a need to repeat it here.
The problem is that the code isn't working and I can't figure out why.

I tried to make the protection behave like KVM_MEM_READONLY, but at the
page (SPTE) level. The problem I am currently facing is that, when
handling the hypercall, vcpu->mode turns out to be OUTSIDE_GUEST_MODE but
KVM_REQ_TLB_FLUSH doesn't seem to be handled correctly. The KVM
documentation promises that when a VCPU is not in GUEST_MODE its requests
are handled as soon as possible, and that kvm_vcpu_kick(vcpu); will even
force that, but this doesn't seem to be the case for me. This is the kind
of logging I am getting:

[3556.312299] kvm_mmu_slot_apply_flags: visited
[3556.312301] kvm_mmu_slot_apply_write_access: Flush = false
[3557.034243] gfn_is_readonly: test_bit = 0
[3557.034251] gfn_is_readonly: test_bit = 0
[3557.034254] gfn_is_readonly: test_bit = 0
[3557.034463] Hypercall received, page address 0x0
[3557.034466] gfn_is_readonly: test_bit = 0
[3557.034469] kvm_mroe: flush state = Done
[3557.034472] kvm_mroe: cpu mode = OUTSIDE_GUEST_MODE
[3557.034475] Setting page number 0 in slot number 0
[3557.034480] slot_rmap_apply_protection: The 0th page is readonly, Flush = True
[3557.034483] kvm_mmu_slot_apply_write_access: Flush = true
[3557.034486] kvm_mroe: cpu mode = OUTSIDE_GUEST_MODE
[3557.034488] kvm_mroe: cpu mode = OUTSIDE_GUEST_MODE
[3557.034490] kvm_mroe: flush state = Waiting

For some reason kvm_vcpu_kick() didn't force KVM_REQ_TLB_FLUSH to be
processed by the virtual CPU (I am talking about the last 2 lines).

I am aware that there is still a lot missing (like dealing with malicious
guest remappings) and the code quality sucks, but any ideas about what I
could be doing wrong (or ideas in general) would be appreciated. I am
already planning to do everything cleanly once it works.

Thanks.

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/include/asm/kvm_host.h |   7 ++-
 arch/x86/kvm/Kconfig|   7 +++
 arch/x86/kvm/mmu.c  | 127 +++-
 arch/x86/kvm/x86.c  |  83 --
 include/linux/kvm_host.h|  17 ++
 include/uapi/linux/kvm_para.h   |   4 +-
 virt/kvm/kvm_main.c |  36 +---
 7 files changed, 226 insertions(+), 55 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c13cd28d9d1b..c66e9245f750 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -235,7 +235,10 @@ struct kvm_mmu_memory_cache {
int nobjs;
void *objects[KVM_NR_MEM_OBJS];
 };
-
+struct kvm_write_access_data {
+   int i;
+   struct kvm_memory_slot *memslot;
+};
 /*
  * the pages used as guest page table on soft mmu are tracked by
  * kvm_memory_slot.arch.gfn_track which is 16 bits, so the role bits used
@@ -1130,7 +1133,7 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 
accessed_mask,
u64 acc_track_mask, u64 me_mask);
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
-void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
+void kvm_mmu_slot_apply_write_access(struct kvm *kvm,
  struct kvm_memory_slot *memslot);
 void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
   const struct kvm_memory_slot *memslot);
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 92fd433c50b9..8ae822a8dc7a 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -96,6 +96,13 @@ config KVM_MMU_AUDIT
 This option adds a R/W kVM module parameter 'mmu_audit', which allows
 auditing of KVM MMU events at runtime.
 
+config KVM_MROE
+   bool "Hypercall Memory Read-Only Enforcement"
+   depends on KVM && X86
+   help
+   This option add KVM_HC_HMROE hypercall to kvm which as hardening
+   mechanism to protect memory pages from being edited.
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/vhost/Kconfig
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d594690d8b95..946545b8b8cb 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -70,7 +70,7 @@ enum {
 #undef MMU_DEBUG
 
 #ifdef MMU_DEBUG
-static bool dbg = 0;
+static bool dbg = 1;
 module_param(dbg, bool, 0644);
 
 #define pgprintk(x...) do { if (dbg) printk(x); } while (0)
@@ -1402,7 +1402,6 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 
*sptep)
 static bool spte_write_protect(u64 *sptep, bool pt_protect)
 {
u64 spte = *sptep;
-
if (!is_writable_pte(spte) &&
  !(pt_protect && spte_can_locklessl

[RFC V2] kvm: Adding skelaton for Memory ROE

2018-07-16 Thread Ahmed Abd El Mawgood
This is an attempt to implement Memory ROE, which I discussed earlier as a
way to prevent rootkits. I have already explained it in detail in this
thread:
https://www.mail-archive.com/kernelnewbies@kernelnewbies.org/msg18826.html
so I don't think there is a need to repeat it here.
The problem is that the code isn't working and I can't figure out why.

I tried to make the protection behave like KVM_MEM_READONLY, but at the
page (SPTE) level. The problem I am currently facing is that, when
handling the hypercall, vcpu->mode turns out to be OUTSIDE_GUEST_MODE but
KVM_REQ_TLB_FLUSH doesn't seem to be handled correctly. The KVM
documentation promises that when a VCPU is not in GUEST_MODE its requests
are handled as soon as possible, and that kvm_vcpu_kick(vcpu); will even
force that, but this doesn't seem to be the case for me. This is the kind
of logging I am getting:

[9073.753306] I came here kvm_mmu_slot_apply_flags
[9073.753311] kvm_mmu_slot_apply_write_access: Flush = false
[9073.992536] gfn_is_readonly: test_bit = 0
[9073.992543] gfn_is_readonly: test_bit = 0
[9073.992545] gfn_is_readonly: test_bit = 0
[9073.992703] Hypercall received, page address 0x0
[9073.992705] gfn_is_readonly: test_bit = 0
[9073.992708] kvm_mroe: flush state = Done
[9073.992709] kvm_mroe: cpu mode = OUTSIDE_GUEST_MODE
[9073.992711] Setting page number 0 in slot number 0
[9073.992715] slot_rmap_apply_protection: The 0th page is readonly, Flush = True
[9073.992717] kvm_mmu_slot_apply_write_access: Flush = true
[9073.992719] kvm_mroe: cpu mode = OUTSIDE_GUEST_MODE
[9073.992720] kvm_mroe: cpu mode = OUTSIDE_GUEST_MODE
[9073.992721] kvm_mroe: flush state = Waiting

For some reason kvm_vcpu_kick() didn't force KVM_REQ_TLB_FLUSH to be
processed by the virtual CPU (I am talking about the last 2 lines).

I am aware that there is still a lot missing (like dealing with malicious
guest remappings) and the code quality sucks, but any ideas about what I
could be doing wrong (or ideas in general) would be appreciated. I am
already planning to do everything cleanly once it works.

Thanks.

Edits for V2:
Unfortunately I made a few mistakes that led to the kernel not compiling
with a vanilla .config file: I forgot #ifdef..#endif in some places, so
symbols that are only available when CONFIG_KVM_MROE is set were used even
when it wasn't. This is now fixed. I should note that CONFIG_KVM_MROE
should be enabled when testing my code and trying to figure out what went
wrong.
Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/include/asm/kvm_host.h |   7 +-
 arch/x86/kvm/Kconfig|   7 ++
 arch/x86/kvm/mmu.c  | 158 ++--
 arch/x86/kvm/x86.c  |  83 -
 include/linux/kvm_host.h|  17 +
 include/uapi/linux/kvm_para.h   |   4 +-
 virt/kvm/kvm_main.c |  36 +++--
 7 files changed, 257 insertions(+), 55 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c13cd28d9d1b..c66e9245f750 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -235,7 +235,10 @@ struct kvm_mmu_memory_cache {
int nobjs;
void *objects[KVM_NR_MEM_OBJS];
 };
-
+struct kvm_write_access_data {
+   int i;
+   struct kvm_memory_slot *memslot;
+};
 /*
  * the pages used as guest page table on soft mmu are tracked by
  * kvm_memory_slot.arch.gfn_track which is 16 bits, so the role bits used
@@ -1130,7 +1133,7 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 
accessed_mask,
u64 acc_track_mask, u64 me_mask);
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
-void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
+void kvm_mmu_slot_apply_write_access(struct kvm *kvm,
  struct kvm_memory_slot *memslot);
 void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
   const struct kvm_memory_slot *memslot);
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 92fd433c50b9..8ae822a8dc7a 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -96,6 +96,13 @@ config KVM_MMU_AUDIT
 This option adds a R/W kVM module parameter 'mmu_audit', which allows
 auditing of KVM MMU events at runtime.
 
+config KVM_MROE
+   bool "Hypercall Memory Read-Only Enforcement"
+   depends on KVM && X86
+   help
+   This option add KVM_HC_HMROE hypercall to kvm which as hardening
+   mechanism to protect memory pages from being edited.
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/vhost/Kconfig
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d594690d8b95..e06e923f90aa 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -70,7 +70,7 @@ enum {
 #undef MMU_DEBUG
 
 #ifdef MMU_DEBUG
-static bool dbg = 0;
+static b

Hello,

2014-08-15 Thread Lina Tayeb El Safi


Hi,
How are you today? My name is Lina Tayeb El Safi. I saw your email on 
my search for a nice and trusted person so i decided to write to you. 
I will like you to write and tell me more about your Self, from there 
i will reply you with more of my details and pictures. I will be 
waiting to receive from you. Have a nice day.

Best regard.

Yours sincerely
Lina Tayeb,





[RESEND PATCH V8 07/11] KVM: Add support for byte granular memory ROE

2019-01-20 Thread Ahmed Abd El Mawgood
This patch documents and implements ROE_MPROTECT_CHUNK, a part of the ROE
hypercall designed to protect regions of a memory page with byte
granularity. This feature provides a key primitive for protecting against
attacks that involve page remapping.
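
Byte-granular protection comes down to asking whether a write of len bytes
at some guest physical address touches a protected chunk. The sketch below
shows the closed-interval overlap test in isolation. It is a user-space
illustration with hypothetical sketch_ names and uint64_t in place of the
kernel's gpa_t; it assumes both ranges are non-empty and do not wrap around
the end of the address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a protected byte range inside a guest page. */
struct sketch_chunk {
    uint64_t gpa;  /* first protected byte */
    uint64_t size; /* number of protected bytes */
};

/*
 * Two closed ranges [a, a_end] and [b, b_end] overlap exactly when
 * a <= b_end and a_end >= b. Here the write is [gpa, gpa + len - 1] and
 * the chunk is [chunk->gpa, chunk->gpa + chunk->size - 1].
 */
static bool sketch_range_overlap(const struct sketch_chunk *chunk,
                                 uint64_t gpa, uint64_t len)
{
    assert(chunk->size > 0 && len > 0); /* empty ranges are not modelled */

    return gpa <= chunk->gpa + chunk->size - 1 &&
           gpa + len - 1 >= chunk->gpa;
}
```

With a chunk covering bytes 0x1010 to 0x1017, a one-byte write at 0x1017
overlaps, while a four-byte write starting at 0x1018 does not.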

Signed-off-by: Ahmed Abd El Mawgood 
---
 include/linux/kvm_host.h  |  24 
 include/uapi/linux/kvm_para.h |   1 +
 virt/kvm/kvm_main.c   |  24 +++-
 virt/kvm/roe.c| 212 --
 virt/kvm/roe_generic.h|   6 +
 5 files changed, 253 insertions(+), 14 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a627c6e81a..9acf5f54ac 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -294,10 +294,34 @@ static inline int kvm_vcpu_exiting_guest_mode(struct 
kvm_vcpu *vcpu)
  */
 #define KVM_MEM_MAX_NR_PAGES ((1UL << 31) - 1)
 
+/*
+ * This structure is used to hold memory areas that are to be protected in a
+ * memory frame with mixed page permissions.
+ **/
+struct protected_chunk {
+   gpa_t gpa;
+   u64 size;
+   struct list_head list;
+};
+
+static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk,
+   gpa_t gpa, int len) {
+   /*
+* https://stackoverflow.com/questions/325933/
+* determine-whether-two-date-ranges-overlap
+* Assuming that it works, that link ^ provides a solution that is
+* better than anything I would ever come up with.
+*/
+   return (gpa <= chunk->gpa + chunk->size - 1) &&
+   (gpa + len - 1 >= chunk->gpa);
+}
+
 struct kvm_memory_slot {
gfn_t base_gfn;
unsigned long npages;
unsigned long *roe_bitmap;
+   unsigned long *partial_roe_bitmap;
+   struct list_head *prot_list;
unsigned long *dirty_bitmap;
struct kvm_arch_memory_slot arch;
unsigned long userspace_addr;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index e6004e0750..4a84f974bc 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -33,6 +33,7 @@
 /* ROE Functionality parameters */
 #define ROE_VERSION0
 #define ROE_MPROTECT   1
+#define ROE_MPROTECT_CHUNK 2
 /*
  * hypercalls use architecture specific
  */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 88b5fbcbb0..819033f475 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1354,18 +1354,19 @@ static bool memslot_is_readonly(struct kvm_memory_slot 
*slot)
 
 static bool gfn_is_readonly(struct kvm_memory_slot *slot, gfn_t gfn)
 {
-   return gfn_is_full_roe(slot, gfn) || memslot_is_readonly(slot);
+   return gfn_is_full_roe(slot, gfn) ||
+  gfn_is_partial_roe(slot, gfn) ||
+  memslot_is_readonly(slot);
 }
 
+
 static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
   gfn_t *nr_pages, bool write)
 {
if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
return KVM_HVA_ERR_BAD;
-
if (gfn_is_readonly(slot, gfn) && write)
return KVM_HVA_ERR_RO_BAD;
-
if (nr_pages)
*nr_pages = slot->npages - (gfn - slot->base_gfn);
 
@@ -1927,14 +1928,29 @@ int kvm_vcpu_read_guest_atomic(struct kvm_vcpu *vcpu, 
gpa_t gpa,
return __kvm_read_guest_atomic(slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
+static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
+   int len)
+{
+   u64 addr;
 
+   if (!slot)
+   return KVM_HVA_ERR_RO_BAD;
+   if (kvm_roe_check_range(slot, gfn, offset, len))
+   return KVM_HVA_ERR_RO_BAD;
+   if (memslot_is_readonly(slot))
+   return KVM_HVA_ERR_RO_BAD;
+   if (gfn_is_full_roe(slot, gfn))
+   return KVM_HVA_ERR_RO_BAD;
+   addr = __gfn_to_hva_many(slot, gfn, NULL, false);
+   return addr;
+}
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
  const void *data, int offset, int len)
 {
int r;
unsigned long addr;
 
-   addr = gfn_to_hva_memslot(memslot, gfn);
+   addr = roe_gfn_to_hva(memslot, gfn, offset, len);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = __copy_to_user((void __user *)addr + offset, data, len);
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index 33d3a4f507..4393a6a6a2 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -11,34 +11,89 @@
 #include 
 #include 
 #include 
+#include "roe_generic.h"
 
 int kvm_roe_init(struct kvm_memory_slot *slot)
 {
slot->roe_bitmap = kvzalloc(BITS_TO_LONGS(slot->npages) *
sizeof(unsigned long), GFP_KERNEL);
if (!slot->roe_bitmap)
-   return -ENOMEM;
+   

[RESEND PATCH V8 02/11] KVM: X86: Add arbitrary data pointer in kvm memslot iterator functions

2019-01-21 Thread Ahmed Abd El Mawgood
This helps share data with the slot_level_handler callback. In my
case I need to share a counter of the pages traversed so that it can be
used to index a bitmap. Being able to pass an arbitrary memory pointer into
the slot_level_handler callback made that easy.
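
The pattern is the usual C idiom of threading an opaque void * through an
iterator into its callback. The following user-space sketch shows it on a
plain array. All names are hypothetical and the "entries" are simple ints
rather than rmap heads; callers that need no shared state pass NULL, as the
existing call sites in this patch do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handler type: returns true when it changed the entry. */
typedef bool (*sketch_handler)(int *entry, void *data);

/* State the caller wants to share with the handler across invocations. */
struct sketch_walk_data {
    int visited; /* number of entries traversed so far */
};

/* Example handler: clears the entry and counts how many it has seen. */
static bool sketch_count_and_clear(int *entry, void *data)
{
    struct sketch_walk_data *d = data; /* may be NULL */
    bool changed = (*entry != 0);

    *entry = 0;
    if (d)
        d->visited++;
    return changed;
}

/* Iterator: forwards the opaque pointer to every handler invocation. */
static bool sketch_walk(int *entries, size_t n, sketch_handler fn, void *data)
{
    bool flush = false;

    assert(fn != NULL);
    for (size_t i = 0; i < n; i++)
        flush |= fn(&entries[i], data);
    return flush; /* true if any entry was modified */
}
```

The iterator itself never interprets the pointer, so new handlers can define
their own state struct without another signature change.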

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/mmu.c | 65 ++
 1 file changed, 37 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ce770b4462..098df7d135 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1525,7 +1525,7 @@ static bool spte_write_protect(u64 *sptep, bool 
pt_protect)
 
 static bool __rmap_write_protect(struct kvm *kvm,
 struct kvm_rmap_head *rmap_head,
-bool pt_protect)
+bool pt_protect, void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1564,7 +1564,8 @@ static bool wrprot_ad_disabled_spte(u64 *sptep)
  * - W bit on ad-disabled SPTEs.
  * Returns true iff any D or W bits were cleared.
  */
-static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head 
*rmap_head)
+static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head 
*rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1590,7 +1591,8 @@ static bool spte_set_dirty(u64 *sptep)
return mmu_spte_update(sptep, spte);
 }
 
-static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
+static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1622,7 +1624,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm 
*kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_write_protect(kvm, rmap_head, false);
+   __rmap_write_protect(kvm, rmap_head, false, NULL);
 
/* clear the first set bit */
mask &= mask - 1;
@@ -1648,7 +1650,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + 
__ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_clear_dirty(kvm, rmap_head);
+   __rmap_clear_dirty(kvm, rmap_head, NULL);
 
/* clear the first set bit */
mask &= mask - 1;
@@ -1701,7 +1703,8 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
 
for (i = PT_PAGE_TABLE_LEVEL; i <= PT_MAX_HUGEPAGE_LEVEL; ++i) {
rmap_head = __gfn_to_rmap(gfn, i, slot);
-   write_protected |= __rmap_write_protect(kvm, rmap_head, true);
+   write_protected |= __rmap_write_protect(kvm, rmap_head, true,
+   NULL);
}
 
return write_protected;
@@ -1715,7 +1718,8 @@ static bool rmap_write_protect(struct kvm_vcpu *vcpu, u64 
gfn)
return kvm_mmu_slot_gfn_write_protect(vcpu->kvm, slot, gfn);
 }
 
-static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
+static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
+   void *data)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1735,7 +1739,7 @@ static int kvm_unmap_rmapp(struct kvm *kvm, struct 
kvm_rmap_head *rmap_head,
   struct kvm_memory_slot *slot, gfn_t gfn, int level,
   unsigned long data)
 {
-   return kvm_zap_rmapp(kvm, rmap_head);
+   return kvm_zap_rmapp(kvm, rmap_head, NULL);
 }
 
 static int kvm_set_pte_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head,
@@ -5552,13 +5556,15 @@ void kvm_mmu_uninit_vm(struct kvm *kvm)
 }
 
 /* The return value indicates if tlb flush on all vcpus is needed. */
-typedef bool (*slot_level_handler) (struct kvm *kvm, struct kvm_rmap_head 
*rmap_head);
+typedef bool (*slot_level_handler) (struct kvm *kvm,
+   struct kvm_rmap_head *rmap_head, void *data);
 
 /* The caller should hold mmu-lock before calling this function. */
 static __always_inline bool
 slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot,
slot_level_handler fn, int start_level, int end_level,
-   gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb)
+   gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb,
+   void *data)
 {
struct slot_rmap_walk_iterator iterator;
bool flush = false;
@@ -5566,7 +5572,7 @@ slot_handle_level_range(struct kvm *kvm, struct 
kvm_memory_slot *memslot,
for_each_slot_rmap_range(memslot, start_level, end_level, start_gfn,
  

[RESEND PATCH V8 09/11] KVM: Add new exit reason For ROE violations

2019-01-21 Thread Ahmed Abd El Mawgood
The problem is that QEMU is not able to detect ROE violations, so
one option would be to create a host API that tells whether a given page is
ROE protected, and another would be to create a ROE violation exit reason.
This patch takes the second approach.
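
The decision this patch adds to the emulation path can be sketched in
user-space C as follows. This is an illustration under assumptions, not the
patch itself: the sketch_ names are hypothetical, the slot is reduced to one
fully protected frame plus one protected chunk, pages are assumed to be
4 KiB, and the numeric values mirror KVM_EXIT_MMIO (6) from the uapi header
and the KVM_EXIT_ROE value (28) proposed in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */
#define SKETCH_EXIT_MMIO  6  /* value of KVM_EXIT_MMIO */
#define SKETCH_EXIT_ROE   28 /* value proposed for KVM_EXIT_ROE */

/* Hypothetical, heavily reduced view of a memslot's ROE state. */
struct sketch_slot {
    uint64_t full_roe_gfn; /* one fully protected guest frame */
    uint64_t chunk_gpa;    /* start of one byte-granular protected chunk */
    uint64_t chunk_len;    /* length of that chunk in bytes */
};

/* Pick the exit reason reported to userspace for an emulated access. */
static int sketch_exit_reason(const struct sketch_slot *slot, uint64_t gpa,
                              uint64_t bytes, bool is_write)
{
    assert(bytes > 0 && slot->chunk_len > 0);

    uint64_t gfn = gpa >> SKETCH_PAGE_SHIFT;
    bool hits_chunk = gpa <= slot->chunk_gpa + slot->chunk_len - 1 &&
                      gpa + bytes - 1 >= slot->chunk_gpa;

    /* Only writes to protected memory count as ROE violations. */
    if (is_write && (hits_chunk || gfn == slot->full_roe_gfn))
        return SKETCH_EXIT_ROE;
    return SKETCH_EXIT_MMIO;
}
```

Reads of protected memory and writes to unprotected memory keep the existing
MMIO exit reason, so userspace that does not know about ROE only sees a new
value for the violating writes.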

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/x86.c   | 10 +-
 include/kvm/roe.h| 12 
 include/uapi/linux/kvm.h |  2 +-
 virt/kvm/kvm_main.c  |  1 +
 virt/kvm/roe.c   |  2 +-
 virt/kvm/roe_generic.h   |  9 +
 6 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 19b0f2307e..368e3d99fd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5409,6 +5409,7 @@ static int emulator_read_write(struct x86_emulate_ctxt 
*ctxt,
const struct read_write_emulator_ops *ops)
 {
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+   struct kvm_memory_slot *slot;
gpa_t gpa;
int rc;
 
@@ -5450,7 +5451,14 @@ static int emulator_read_write(struct x86_emulate_ctxt *ctxt,
 
vcpu->run->mmio.len = min(8u, vcpu->mmio_fragments[0].len);
vcpu->run->mmio.is_write = vcpu->mmio_is_write = ops->write;
-   vcpu->run->exit_reason = KVM_EXIT_MMIO;
+   slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa >> PAGE_SHIFT);
+   if (slot && ops->write && (kvm_roe_check_range(slot, gpa>>PAGE_SHIFT,
+   gpa - (gpa & PAGE_MASK), bytes) ||
+   gfn_is_full_roe(slot, gpa>>PAGE_SHIFT)))
+   vcpu->run->exit_reason = KVM_EXIT_ROE;
+   else
+   vcpu->run->exit_reason = KVM_EXIT_MMIO;
+
vcpu->run->mmio.phys_addr = gpa;
 
return ops->read_write_exit_mmio(vcpu, gpa, val, bytes);
diff --git a/include/kvm/roe.h b/include/kvm/roe.h
index 6a86866623..3121a67753 100644
--- a/include/kvm/roe.h
+++ b/include/kvm/roe.h
@@ -13,4 +13,16 @@ void kvm_roe_arch_commit_protection(struct kvm *kvm,
struct kvm_memory_slot *slot);
 int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3);
 bool kvm_roe_arch_is_userspace(struct kvm_vcpu *vcpu);
+bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
+   int len);
+static inline bool gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   return test_bit(gfn - slot->base_gfn, slot->roe_bitmap);
+
+}
+static inline bool gfn_is_partial_roe(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   return test_bit(gfn - slot->base_gfn, slot->partial_roe_bitmap);
+}
+
 #endif
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 6d4ea4b6c9..0a386bb5f2 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -235,7 +235,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI25
 #define KVM_EXIT_IOAPIC_EOI   26
 #define KVM_EXIT_HYPERV   27
-
+#define KVM_EXIT_ROE 28
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
 #define KVM_INTERNAL_ERROR_EMULATION   1
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 819033f475..d92d300539 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -62,6 +62,7 @@
 #include "async_pf.h"
 #include "vfio.h"
 #include "roe_generic.h"
+#include 
 
 #define CREATE_TRACE_POINTS
 #include 
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index 4393a6a6a2..9540473f89 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -60,7 +60,7 @@ bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
return false;
return kvm_roe_protected_range(slot, gpa, len);
 }
-
+EXPORT_SYMBOL_GPL(kvm_roe_check_range);
 
 void kvm_roe_free(struct kvm_memory_slot *slot)
 {
diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h
index ad121372f2..f1ce4a8aec 100644
--- a/virt/kvm/roe_generic.h
+++ b/virt/kvm/roe_generic.h
@@ -14,12 +14,5 @@ void kvm_roe_free(struct kvm_memory_slot *slot);
 int kvm_roe_init(struct kvm_memory_slot *slot);
 bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
int len);
-static inline bool gfn_is_full_roe(struct kvm_memory_slot *slot, gfn_t gfn)
-{
-   return test_bit(gfn - slot->base_gfn, slot->roe_bitmap);
-}
-static inline bool gfn_is_partial_roe(struct kvm_memory_slot *slot, gfn_t gfn)
-{
-   return test_bit(gfn - slot->base_gfn, slot->partial_roe_bitmap);
-}
+
 #endif
-- 
2.19.2



[RESEND PATCH V8 08/11] KVM: X86: Port ROE_MPROTECT_CHUNK to x86

2019-01-21 Thread Ahmed Abd El Mawgood
Apply d->memslot->partial_roe_bitmap to shadow page table entries
too.
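
The decision this patch makes for each SPTE can be sketched in userspace as
follows. It is a minimal model, not the kernel code: `bit_is_set()` stands in
for the kernel's `test_bit()`, and `roe_pt_protect()` is a hypothetical name
for the expression computed inside `__rmap_write_protect_roe()`.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Userspace stand-in for the kernel's test_bit(). */
static bool bit_is_set(size_t nr, const unsigned long *map)
{
	return (map[nr / BITS_PER_ULONG] >> (nr % BITS_PER_ULONG)) & 1UL;
}

/*
 * Mirrors the expression in the patch: the pt_protect request is only
 * forwarded for pages that are in neither the full nor the partial
 * ROE bitmap.
 */
static bool roe_pt_protect(size_t idx, const unsigned long *full_bmp,
			   const unsigned long *part_bmp, bool pt_protect)
{
	bool roe = bit_is_set(idx, full_bmp) || bit_is_set(idx, part_bmp);

	return !roe && pt_protect;
}
```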

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/roe.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/roe.c b/arch/x86/kvm/roe.c
index f787106be8..700f69823b 100644
--- a/arch/x86/kvm/roe.c
+++ b/arch/x86/kvm/roe.c
@@ -25,11 +25,14 @@ static bool __rmap_write_protect_roe(struct kvm *kvm,
struct rmap_iterator iter;
bool prot;
bool flush = false;
+   void *full_bmp =  memslot->roe_bitmap;
+   void *part_bmp = memslot->partial_roe_bitmap;
 
for_each_rmap_spte(rmap_head, &iter, sptep) {
int idx = spte_to_gfn(sptep) - memslot->base_gfn;
 
-   prot = !test_bit(idx, memslot->roe_bitmap) && pt_protect;
+   prot = !(test_bit(idx, full_bmp) || test_bit(idx, part_bmp));
+   prot = prot && pt_protect;
flush |= spte_write_protect(sptep, prot);
}
return flush;
-- 
2.19.2



[RESEND PATCH V8 0/11] KVM: X86: Introducing ROE Protection Kernel Hardening

2019-01-21 Thread Ahmed Abd El Mawgood
"Actually this is more of an ABI demonstration\n");
pr_info("than actual use case\n");
}
module_init(hello);
module_exit(bye);

```

I tested this on a Gentoo host with an Ubuntu guest, using QEMU built from git
with the following changes applied:

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 4880a05399..57d0973aca 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2035,6 +2035,9 @@ int kvm_cpu_exec(CPUState *cpu)
  run->mmio.is_write);
 ret = 0;
 break;
+   case KVM_EXIT_ROE:
+   ret = 0;
+   break;
 case KVM_EXIT_IRQ_WINDOW_OPEN:
 DPRINTF("irq_window_open\n");
 ret = EXCP_INTERRUPT;
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index f11a7eb49c..67aded8f00 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -235,7 +235,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI25
 #define KVM_EXIT_IOAPIC_EOI   26
 #define KVM_EXIT_HYPERV   27
-
+#define KVM_EXIT_ROE  28
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
 #define KVM_INTERNAL_ERROR_EMULATION   1



-- Change log V7 -> V8 --

- Fixed a bug in patch 10 (it did not work).
- Replaced the linked list used to store protected chunks with a red-black
  tree. This gave a large performance improvement: with ~2000 protected chunks,
  the query time on writes is now almost constant.


-- Known Issues --

- THP is not supported yet. More generally, ROE is not supported when the guest
  frame size differs from the corresponding EPT frame size.

The previous version (V7) of the patch set can be found at [1]

-- links --

[1] https://lkml.org/lkml/2018/12/7/345
[2] https://lkml.org/lkml/2018/12/21/340

-- List of patches --

[PATCH V8 01/11] KVM: State whether memory should be freed in kvm_free_memslot
[PATCH V8 02/11] KVM: X86: Add arbitrary data pointer in kvm memslot
[PATCH V8 03/11] KVM: X86: Add helper function to convert SPTE to GFN
[PATCH V8 04/11] KVM: Document Memory ROE
[PATCH V8 05/11] KVM: Create architecture independent ROE skeleton
[PATCH V8 06/11] KVM: X86: Enable ROE for x86
[PATCH V8 07/11] KVM: Add support for byte granular memory ROE
[PATCH V8 08/11] KVM: X86: Port ROE_MPROTECT_CHUNK to x86
[PATCH V8 09/11] KVM: Add new exit reason For ROE violations
[PATCH V8 10/11] KVM: Log ROE violations in system log
[PATCH V8 11/11] KVM: ROE: Store protected chunks in red black tree

-- Diffstat --

Documentation/virtual/kvm/hypercalls.txt |  40 +++
arch/x86/include/asm/kvm_host.h  |   2 +-
arch/x86/kvm/Makefile|   4 +-
arch/x86/kvm/mmu.c   | 121 -
arch/x86/kvm/mmu.h   |  31 ++-
arch/x86/kvm/roe.c   | 104 
arch/x86/kvm/roe_arch.h  |  28 ++
arch/x86/kvm/x86.c   |  21 +-
include/kvm/roe.h|  28 ++
include/linux/kvm_host.h |  57 
include/uapi/linux/kvm.h |   2 +-
include/uapi/linux/kvm_para.h|   5 +
virt/kvm/kvm_main.c  |  54 +++-
virt/kvm/roe.c   | 445 +++
virt/kvm/roe_generic.h   |  22 ++
15 files changed, 868 insertions(+), 96 deletions(-)


Signed-off-by: Ahmed Abd El Mawgood 


[RESEND PATCH V8 03/11] KVM: X86: Add helper function to convert SPTE to GFN

2019-01-21 Thread Ahmed Abd El Mawgood
Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/kvm/mmu.c | 7 +++
 arch/x86/kvm/mmu.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 098df7d135..bbfe3f2863 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1053,6 +1053,13 @@ static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index)
 
return sp->gfn + (index << ((sp->role.level - 1) * PT64_LEVEL_BITS));
 }
+gfn_t spte_to_gfn(u64 *spte)
+{
+   struct kvm_mmu_page *sp;
+
+   sp = page_header(__pa(spte));
+   return kvm_mmu_page_get_gfn(sp, spte - sp->spt);
+}
 
 static void kvm_mmu_page_set_gfn(struct kvm_mmu_page *sp, int index, gfn_t gfn)
 {
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index c7b333147c..49d7f2f002 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -211,4 +211,5 @@ void kvm_mmu_gfn_allow_lpage(struct kvm_memory_slot *slot, gfn_t gfn);
 bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
struct kvm_memory_slot *slot, u64 gfn);
 int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
+gfn_t spte_to_gfn(u64 *sptep);
 #endif
-- 
2.19.2



[RESEND PATCH V8 06/11] KVM: X86: Enable ROE for x86

2019-01-21 Thread Ahmed Abd El Mawgood
This patch implements kvm_roe_arch_commit_protection() and
kvm_roe_arch_is_userspace() for x86, and invokes kvm_roe() via the
appropriate vmcall.

Signed-off-by: Ahmed Abd El Mawgood 
---
 arch/x86/include/asm/kvm_host.h |   2 +-
 arch/x86/kvm/Makefile   |   4 +-
 arch/x86/kvm/mmu.c  |  71 +-
 arch/x86/kvm/mmu.h  |  30 +-
 arch/x86/kvm/roe.c  | 101 
 arch/x86/kvm/roe_arch.h |  28 +
 arch/x86/kvm/x86.c  |  11 ++--
 7 files changed, 183 insertions(+), 64 deletions(-)
 create mode 100644 arch/x86/kvm/roe.c
 create mode 100644 arch/x86/kvm/roe_arch.h

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4660ce90de..797d838c3e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1239,7 +1239,7 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
u64 acc_track_mask, u64 me_mask);
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
-void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
+void kvm_mmu_slot_apply_write_access(struct kvm *kvm,
  struct kvm_memory_slot *memslot);
 void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
   const struct kvm_memory_slot *memslot);
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 69b3a7c300..39f7766afe 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -9,7 +9,9 @@ CFLAGS_vmx.o := -I.
 KVM := ../../../virt/kvm
 
 kvm-y  += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
-   $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o
+  $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o \
+  $(KVM)/roe.o roe.o
+
 kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
 
 kvm-y  += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bbfe3f2863..2e3a43076e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -23,7 +23,7 @@
 #include "x86.h"
 #include "kvm_cache_regs.h"
 #include "cpuid.h"
-
+#include "roe_arch.h"
 #include 
 #include 
 #include 
@@ -1343,8 +1343,8 @@ static void pte_list_remove(struct kvm_rmap_head *rmap_head, u64 *sptep)
__pte_list_remove(sptep, rmap_head);
 }
 
-static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
-  struct kvm_memory_slot *slot)
+struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
+   struct kvm_memory_slot *slot)
 {
unsigned long idx;
 
@@ -1394,16 +1394,6 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
__pte_list_remove(spte, rmap_head);
 }
 
-/*
- * Used by the following functions to iterate through the sptes linked by a
- * rmap.  All fields are private and not assumed to be used outside.
- */
-struct rmap_iterator {
-   /* private fields */
-   struct pte_list_desc *desc; /* holds the sptep if not NULL */
-   int pos;/* index of the sptep */
-};
-
 /*
  * Iteration must be started by this function.  This should also be used after
  * removing/dropping sptes from the rmap link because in such cases the
@@ -1411,8 +1401,7 @@ struct rmap_iterator {
  *
  * Returns sptep if found, NULL otherwise.
  */
-static u64 *rmap_get_first(struct kvm_rmap_head *rmap_head,
-  struct rmap_iterator *iter)
+u64 *rmap_get_first(struct kvm_rmap_head *rmap_head, struct rmap_iterator *iter)
 {
u64 *sptep;
 
@@ -1438,7 +1427,7 @@ static u64 *rmap_get_first(struct kvm_rmap_head *rmap_head,
  *
  * Returns sptep if found, NULL otherwise.
  */
-static u64 *rmap_get_next(struct rmap_iterator *iter)
+u64 *rmap_get_next(struct rmap_iterator *iter)
 {
u64 *sptep;
 
@@ -1513,7 +1502,7 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep)
  *
  * Return true if tlb need be flushed.
  */
-static bool spte_write_protect(u64 *sptep, bool pt_protect)
+bool spte_write_protect(u64 *sptep, bool pt_protect)
 {
u64 spte = *sptep;
 
@@ -1531,8 +1520,7 @@ static bool spte_write_protect(u64 *sptep, bool pt_protect)
 }
 
 static bool __rmap_write_protect(struct kvm *kvm,
-struct kvm_rmap_head *rmap_head,
-bool pt_protect, void *data)
+   struct kvm_rmap_head *rmap_head, bool pt_protect)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1631,7 +1619,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
while (mask) {
rmap_head = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_write_protect(kvm, rmap_head, false, NULL);
+  

[RESEND PATCH V8 04/11] KVM: Document Memory ROE

2019-01-21 Thread Ahmed Abd El Mawgood
The ROE version documented here is implemented in the next two patches.

Signed-off-by: Ahmed Abd El Mawgood 
---
 Documentation/virtual/kvm/hypercalls.txt | 40 
 1 file changed, 40 insertions(+)

diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
index da24c138c8..a31f316ce6 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -141,3 +141,43 @@ a0 corresponds to the APIC ID in the third argument (a2), bit 1
 corresponds to the APIC ID a2+1, and so on.
 
 Returns the number of CPUs to which the IPIs were delivered successfully.
+
+7. KVM_HC_ROE
+
+Architecture: x86
+Status: active
+Purpose: Hypercall used to apply Read-Only Enforcement to guest memory and
+registers
+Usage 1:
+ a0: ROE_VERSION
+
+Returns an unsigned number that represents the current version of the ROE
+implementation.
+
+Usage 2:
+
+ a0: ROE_MPROTECT  (requires version >= 1)
+ a1: Start address aligned to page boundary.
+ a2: Number of pages to be protected.
+
+This configuration lets a guest kernel have part of its read/write memory
+converted into read-only.  This action is irreversible.
+Upon successful run, the number of pages protected is returned.
+
+Usage 3:
+ a0: ROE_MPROTECT_CHUNK(requires version >= 2)
+ a1: Start address aligned to page boundary.
+ a2: Number of bytes to be protected.
+This configuration lets a guest kernel have part of its read/write memory
+converted into read-only with bytes granularity. ROE_MPROTECT_CHUNK is
+relatively slow compared to ROE_MPROTECT. This action is irreversible.
+Upon successful run, the number of bytes protected is returned.
+
+Error codes:
+   -KVM_ENOSYS: system call being triggered from ring 3 or it is not
+   implemented.
+   -EINVAL: error based on given parameters.
+
+Notes: KVM_HC_ROE can not be triggered from guest Ring 3 (user mode). The
+reason is that user mode malicious software can make use of it to enforce read
+only protection on an arbitrary memory page thus crashing the kernel.
-- 
2.19.2
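
The calling convention documented above can be modelled in userspace roughly
as follows. This is a sketch for reading the documentation, not the KVM
implementation: `roe_hypercall_model()` is a hypothetical name, the numeric
value of `ROE_MPROTECT_CHUNK`, the reported version and the 4 KiB page size
are assumptions, and the only parameter error modelled is a misaligned start
address.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define KVM_ENOSYS         1000    /* as in include/uapi/linux/kvm_para.h */
#define ROE_VERSION        0       /* from this series */
#define ROE_MPROTECT       1       /* from this series */
#define ROE_MPROTECT_CHUNK 2       /* assumed value, not shown in this thread */
#define MODEL_ROE_VERSION  2       /* assumed: chunk support needs version >= 2 */
#define MODEL_PAGE_SIZE    4096ULL /* assumed page size */

/* Models only the argument checks and return values described in the text. */
static long roe_hypercall_model(bool from_ring3, uint64_t a0, uint64_t a1,
				uint64_t a2)
{
	if (from_ring3)
		return -KVM_ENOSYS;	/* guest user mode may not call KVM_HC_ROE */

	switch (a0) {
	case ROE_VERSION:
		return MODEL_ROE_VERSION;
	case ROE_MPROTECT:		/* a2 counts pages */
	case ROE_MPROTECT_CHUNK:	/* a2 counts bytes */
		if (a1 % MODEL_PAGE_SIZE)
			return -EINVAL;	/* start must be page aligned per the text */
		return (long)a2;	/* amount protected on success */
	default:
		return -KVM_ENOSYS;	/* unknown sub-function */
	}
}
```

A guest would normally query `ROE_VERSION` first and only use
`ROE_MPROTECT_CHUNK` when the returned version is at least 2.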



[RESEND PATCH V8 01/11] KVM: State whether memory should be freed in kvm_free_memslot

2019-01-21 Thread Ahmed Abd El Mawgood
The conditions under which kvm_free_memslot() frees memory are rather ad hoc,
which makes it hard to extend the memslot with allocated data that needs to be
freed. Replace the current mechanism with an explicit flag that states whether
the memory slot should be freed.

Signed-off-by: Ahmed Abd El Mawgood 
---
 virt/kvm/kvm_main.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1f888a103f..2f37b4b6a2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -548,9 +548,10 @@ static void kvm_destroy_dirty_bitmap(struct kvm_memory_slot *memslot)
  * Free any memory in @free but not in @dont.
  */
 static void kvm_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free,
- struct kvm_memory_slot *dont)
+ struct kvm_memory_slot *dont,
+ enum kvm_mr_change change)
 {
-   if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
+   if (change == KVM_MR_DELETE)
kvm_destroy_dirty_bitmap(free);
 
kvm_arch_free_memslot(kvm, free, dont);
@@ -566,7 +567,7 @@ static void kvm_free_memslots(struct kvm *kvm, struct kvm_memslots *slots)
return;
 
kvm_for_each_memslot(memslot, slots)
-   kvm_free_memslot(kvm, memslot, NULL);
+   kvm_free_memslot(kvm, memslot, NULL, KVM_MR_DELETE);
 
kvfree(slots);
 }
@@ -1061,14 +1062,14 @@ int __kvm_set_memory_region(struct kvm *kvm,
 
kvm_arch_commit_memory_region(kvm, mem, &old, &new, change);
 
-   kvm_free_memslot(kvm, &old, &new);
+   kvm_free_memslot(kvm, &old, &new, change);
kvfree(old_memslots);
return 0;
 
 out_slots:
kvfree(slots);
 out_free:
-   kvm_free_memslot(kvm, &new, &old);
+   kvm_free_memslot(kvm, &new, &old, change);
 out:
return r;
 }
-- 
2.19.2



[RESEND PATCH V8 11/11] KVM: ROE: Store protected chunks in red black tree

2019-01-21 Thread Ahmed Abd El Mawgood
Protected chunks used to be stored in a linked list, which made searching for
a chunk linear in the number of chunks. With 2000 chunks, reading the last
chunk took about 10 times longer than reading the first one. This patch stores
the chunks in a red-black tree for faster search.
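
The lookup idea can be sketched in userspace as below. To stay short, the
sketch swaps in a binary search over a sorted array of non-overlapping chunks
instead of the kernel rbtree; the comparison is the same one as
kvm_roe_range_cmp_position() in this patch. `struct chunk`,
`range_cmp_position()` and `range_is_protected()` are hypothetical names used
only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct chunk {
	uint64_t gpa;	/* start of the protected range */
	uint64_t size;	/* length in bytes */
};

/* -1: range lies before the chunk, 1: after it, 0: they overlap. */
static int range_cmp_position(const struct chunk *c, uint64_t gpa, uint64_t len)
{
	if (gpa + len <= c->gpa)
		return -1;
	if (gpa >= c->gpa + c->size)
		return 1;
	return 0;
}

/* chunks[] must be sorted by gpa and must not overlap each other. */
static bool range_is_protected(const struct chunk *chunks, size_t n,
			       uint64_t gpa, uint64_t len)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = range_cmp_position(&chunks[mid], gpa, len);

		if (cmp < 0)
			hi = mid;	/* target is before this chunk */
		else if (cmp > 0)
			lo = mid + 1;	/* target is after this chunk */
		else
			return true;	/* overlap found */
	}
	return false;
}
```

Each step halves the search space, so the number of comparisons grows
logarithmically with the number of chunks rather than linearly.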

Signed-off-by: Ahmed Abd El Mawgood 
---
 include/linux/kvm_host.h |  36 ++-
 virt/kvm/roe.c   | 228 +++
 virt/kvm/roe_generic.h   |   3 +
 3 files changed, 197 insertions(+), 70 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9acf5f54ac..5f4bec0662 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -301,7 +302,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
 struct protected_chunk {
gpa_t gpa;
u64 size;
-   struct list_head list;
+   struct rb_node node;
 };
 
 static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk,
@@ -316,12 +317,43 @@ static inline bool kvm_roe_range_overlap(struct protected_chunk *chunk,
(gpa + len - 1 >= chunk->gpa);
 }
 
+static inline int kvm_roe_range_cmp_position(struct protected_chunk *chunk,
+   gpa_t gpa, int len) {
+   /*
+* returns -1 if the gpa and len are smaller than chunk.
+* returns 0 if they overlap or strictly adjacent
+* returns 1 if gpa and len are bigger than the chunk
+*/
+
+   if (gpa + len <= chunk->gpa)
+   return -1;
+   if (gpa >= chunk->gpa + chunk->size)
+   return 1;
+   return 0;
+}
+
+static inline int kvm_roe_range_cmp_mergability(struct protected_chunk *chunk,
+   gpa_t gpa, int len) {
+   /*
+* returns -1 if the gpa and len are smaller than chunk and not adjacent
+* to it
+* returns 0 if they overlap or strictly adjacent
+* returns 1 if gpa and len are bigger than the chunk and not adjacent
+* to it
+*/
+   if (gpa + len < chunk->gpa)
+   return -1;
+   if (gpa > chunk->gpa + chunk->size)
+   return 1;
+   return 0;
+
+}
 struct kvm_memory_slot {
gfn_t base_gfn;
unsigned long npages;
unsigned long *roe_bitmap;
unsigned long *partial_roe_bitmap;
-   struct list_head *prot_list;
+   struct rb_root  *prot_root;
unsigned long *dirty_bitmap;
struct kvm_arch_memory_slot arch;
unsigned long userspace_addr;
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index e424b45e1c..15297c0e57 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -23,10 +23,10 @@ int kvm_roe_init(struct kvm_memory_slot *slot)
sizeof(unsigned long), GFP_KERNEL);
if (!slot->partial_roe_bitmap)
goto fail2;
-   slot->prot_list = kvzalloc(sizeof(struct list_head), GFP_KERNEL);
-   if (!slot->prot_list)
+   slot->prot_root = kvzalloc(sizeof(struct rb_root), GFP_KERNEL);
+   if (!slot->prot_root)
goto fail3;
-   INIT_LIST_HEAD(slot->prot_list);
+   *slot->prot_root = RB_ROOT;
return 0;
 fail3:
kvfree(slot->partial_roe_bitmap);
@@ -40,12 +40,19 @@ int kvm_roe_init(struct kvm_memory_slot *slot)
 static bool kvm_roe_protected_range(struct kvm_memory_slot *slot, gpa_t gpa,
int len)
 {
-   struct list_head *pos;
-   struct protected_chunk *cur_chunk;
-
-   list_for_each(pos, slot->prot_list) {
-   cur_chunk = list_entry(pos, struct protected_chunk, list);
-   if (kvm_roe_range_overlap(cur_chunk, gpa, len))
+   struct rb_node *node = slot->prot_root->rb_node;
+
+   while (node) {
+   struct protected_chunk *cur_chunk;
+   int cmp;
+
+   cur_chunk = rb_entry(node, struct protected_chunk, node);
+   cmp = kvm_roe_range_cmp_position(cur_chunk, gpa, len);
+   if (cmp < 0)/*target chunk is before current node*/
+   node = node->rb_left;
+   else if (cmp > 0)/*target chunk is after current node*/
+   node = node->rb_right;
+   else
return true;
}
return false;
@@ -62,18 +69,24 @@ bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
 }
 EXPORT_SYMBOL_GPL(kvm_roe_check_range);
 
-void kvm_roe_free(struct kvm_memory_slot *slot)
+static void kvm_roe_destroy_tree(struct rb_node *node)
 {
-   struct protected_chunk *pos, *n;
-   struct list_head *head = slot->prot_list;
+   struct protected_chunk *cur_chunk;
+
+   if (!node)
+   return;
+   kvm_roe_destroy_tree(node->rb_left);
+   kvm_roe_destroy_tree(node->rb_right);
+   cur_chunk = rb_ent

[RESEND PATCH V8 10/11] KVM: Log ROE violations in system log

2019-01-21 Thread Ahmed Abd El Mawgood
Signed-off-by: Ahmed Abd El Mawgood 
---
 virt/kvm/kvm_main.c|  3 ++-
 virt/kvm/roe.c | 25 +
 virt/kvm/roe_generic.h |  3 ++-
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d92d300539..b3dc7255b0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1945,13 +1945,14 @@ static u64 roe_gfn_to_hva(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
addr = __gfn_to_hva_many(slot, gfn, NULL, false);
return addr;
 }
+
 static int __kvm_write_guest_page(struct kvm_memory_slot *memslot, gfn_t gfn,
  const void *data, int offset, int len)
 {
int r;
unsigned long addr;
-
addr = roe_gfn_to_hva(memslot, gfn, offset, len);
+   kvm_roe_check_and_log(memslot, gfn, data, offset, len);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = __copy_to_user((void __user *)addr + offset, data, len);
diff --git a/virt/kvm/roe.c b/virt/kvm/roe.c
index 9540473f89..e424b45e1c 100644
--- a/virt/kvm/roe.c
+++ b/virt/kvm/roe.c
@@ -76,6 +76,31 @@ void kvm_roe_free(struct kvm_memory_slot *slot)
kvfree(slot->prot_list);
 }
 
+static void kvm_warning_roe_violation(u64 addr, const void *data, int len)
+{
+   int i;
+   const u8 *d = data;
+   char *buf = kvmalloc(len * 3 + 1, GFP_KERNEL);
+
+   for (i = 0; i < len; i++)
+   sprintf(buf+3*i, " %02x", d[i]);
+   pr_warn("ROE violation:\n");
+   pr_warn("\tAttempt to write %d bytes at address 0x%08llx\n", len, addr);
+   pr_warn("\tData: %s\n", buf);
+   kvfree(buf);
+}
+
+void kvm_roe_check_and_log(struct kvm_memory_slot *memslot, gfn_t gfn,
+   const void *data, int offset, int len)
+{
+   if (!memslot)
+   return;
+   if (!gfn_is_full_roe(memslot, gfn) &&
+   !kvm_roe_check_range(memslot, gfn, offset, len))
+   return;
+   kvm_warning_roe_violation((gfn << PAGE_SHIFT) + offset, data, len);
+}
+
 static void kvm_roe_protect_slot(struct kvm *kvm, struct kvm_memory_slot *slot,
gfn_t gfn, u64 npages, bool partial)
 {
diff --git a/virt/kvm/roe_generic.h b/virt/kvm/roe_generic.h
index f1ce4a8aec..6c5f0cf381 100644
--- a/virt/kvm/roe_generic.h
+++ b/virt/kvm/roe_generic.h
@@ -14,5 +14,6 @@ void kvm_roe_free(struct kvm_memory_slot *slot);
 int kvm_roe_init(struct kvm_memory_slot *slot);
 bool kvm_roe_check_range(struct kvm_memory_slot *slot, gfn_t gfn, int offset,
int len);
-
+void kvm_roe_check_and_log(struct kvm_memory_slot *memslot, gfn_t gfn,
+   const void *data, int offset, int len);
 #endif
-- 
2.19.2



[RESEND PATCH V8 05/11] KVM: Create architecture independent ROE skeleton

2019-01-21 Thread Ahmed Abd El Mawgood
This patch introduces a hypercall that can help against a subset of kernel
rootkits. It works by placing read-only protection in the shadow PTEs. The
resulting protection is also recorded in a bitmap for each kvm_memory_slot,
which is used as a reference when updating SPTEs. The goal is to protect the
guest kernel's static data from modification even if the attacker is running
in guest ring 0; for this reason there is no hypercall to revert the effect of
the Memory ROE hypercall. This patch does not implement an integrity check on
the guest TLB, so an obvious attack on the current implementation is to remap
guest virtual addresses to different guest physical addresses, but there are
plans to fix that. For this patch to work on a given arch/, one needs to
implement two architecture-specific functions,
kvm_roe_arch_commit_protection() and kvm_roe_arch_is_userspace(), and to have
kvm_roe() invoked through the appropriate hypercall mechanism.
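
The per-slot bookkeeping described above can be modelled in userspace like
this. It is a sketch under stated assumptions, not the kernel code:
`struct slot_model` and the `slot_model_*()` helpers are hypothetical names,
and the bitmap is indexed by `gfn - base_gfn` as gfn_is_full_roe() does in
this series. There is deliberately no function to clear a bit, since the
protection is meant to be irreversible.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the ROE-related part of struct kvm_memory_slot. */
struct slot_model {
	uint64_t base_gfn;
	uint64_t npages;
	unsigned long *roe_bitmap;	/* one bit per page in the slot */
};

static int slot_model_init(struct slot_model *s, uint64_t base_gfn,
			   uint64_t npages)
{
	size_t words = (npages + BITS_PER_ULONG - 1) / BITS_PER_ULONG;

	s->base_gfn = base_gfn;
	s->npages = npages;
	s->roe_bitmap = calloc(words ? words : 1, sizeof(unsigned long));
	return s->roe_bitmap ? 0 : -1;
}

/* Marks pages read-only; returns how many pages inside the slot were marked. */
static uint64_t slot_model_protect(struct slot_model *s, uint64_t gfn,
				   uint64_t npages)
{
	uint64_t i, done = 0;

	for (i = 0; i < npages; i++) {
		uint64_t cur = gfn + i;
		uint64_t idx;

		if (cur < s->base_gfn || cur - s->base_gfn >= s->npages)
			continue;	/* outside this slot */
		idx = cur - s->base_gfn;
		s->roe_bitmap[idx / BITS_PER_ULONG] |= 1UL << (idx % BITS_PER_ULONG);
		done++;
	}
	return done;
}

/* A write is allowed only to a page of this slot whose ROE bit is clear. */
static bool slot_model_write_allowed(const struct slot_model *s, uint64_t gfn)
{
	uint64_t idx;

	if (gfn < s->base_gfn || gfn - s->base_gfn >= s->npages)
		return false;	/* not backed by this slot */
	idx = gfn - s->base_gfn;
	return !((s->roe_bitmap[idx / BITS_PER_ULONG] >> (idx % BITS_PER_ULONG)) & 1UL);
}
```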

Signed-off-by: Ahmed Abd El Mawgood 
---
 include/kvm/roe.h |  16 
 include/linux/kvm_host.h  |   1 +
 include/uapi/linux/kvm_para.h |   4 +
 virt/kvm/kvm_main.c   |  19 +++--
 virt/kvm/roe.c| 136 ++
 virt/kvm/roe_generic.h|  19 +
 6 files changed, 190 insertions(+), 5 deletions(-)
 create mode 100644 include/kvm/roe.h
 create mode 100644 virt/kvm/roe.c
 create mode 100644 virt/kvm/roe_generic.h

diff --git a/include/kvm/roe.h b/include/kvm/roe.h
new file mode 100644
index 00..6a86866623
--- /dev/null
+++ b/include/kvm/roe.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __KVM_ROE_H__
+#define __KVM_ROE_H__
+/*
+ * KVM Read Only Enforcement
+ * Copyright (c) 2018 Ahmed Abd El Mawgood
+ *
+ * Author Ahmed Abd El Mawgood 
+ *
+ */
+void kvm_roe_arch_commit_protection(struct kvm *kvm,
+   struct kvm_memory_slot *slot);
+int kvm_roe(struct kvm_vcpu *vcpu, u64 a0, u64 a1, u64 a2, u64 a3);
+bool kvm_roe_arch_is_userspace(struct kvm_vcpu *vcpu);
+#endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c38cc5eb7e..a627c6e81a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -297,6 +297,7 @@ static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
 struct kvm_memory_slot {
gfn_t base_gfn;
unsigned long npages;
+   unsigned long *roe_bitmap;
unsigned long *dirty_bitmap;
struct kvm_arch_memory_slot arch;
unsigned long userspace_addr;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 6c0ce49931..e6004e0750 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -28,7 +28,11 @@
 #define KVM_HC_MIPS_CONSOLE_OUTPUT 8
 #define KVM_HC_CLOCK_PAIRING   9
 #define KVM_HC_SEND_IPI10
+#define KVM_HC_ROE 11
 
+/* ROE Functionality parameters */
+#define ROE_VERSION0
+#define ROE_MPROTECT   1
 /*
  * hypercalls use architecture specific
  */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2f37b4b6a2..88b5fbcbb0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -61,6 +61,7 @@
 #include "coalesced_mmio.h"
 #include "async_pf.h"
 #include "vfio.h"
+#include "roe_generic.h"
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -551,9 +552,10 @@ static void kvm_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free,
  struct kvm_memory_slot *dont,
  enum kvm_mr_change change)
 {
-   if (change == KVM_MR_DELETE)
+   if (change == KVM_MR_DELETE) {
+   kvm_roe_free(free);
kvm_destroy_dirty_bitmap(free);
-
+   }
kvm_arch_free_memslot(kvm, free, dont);
 
free->npages = 0;
@@ -1018,6 +1020,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
if (kvm_create_dirty_bitmap(&new) < 0)
goto out_free;
}
+   if (kvm_roe_init(&new) < 0)
+   goto out_free;
 
slots = kvzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
if (!slots)
@@ -1348,13 +1352,18 @@ static bool memslot_is_readonly(struct kvm_memory_slot *slot)
return slot->flags & KVM_MEM_READONLY;
 }
 
+static bool gfn_is_readonly(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+   return gfn_is_full_roe(slot, gfn) || memslot_is_readonly(slot);
+}
+
 static unsigned long __gfn_to_hva_many(struct kvm_memory_slot *slot, gfn_t gfn,
   gfn_t *nr_pages, bool write)
 {
if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
return KVM_HVA_ERR_BAD;
 
-   if (memslot_is_readonly(slot) && write)
+   if (gfn_is_readonly(slot, gfn) && write)
return KVM_HVA_ERR_RO_BAD;
 
if (nr_pages)
@@ -1402,7 +1411,7 @@ unsigned long gfn_to_hva_mems

[PATCH] code cleanups for crypto/blkcipher.c

2015-07-24 Thread Ahmed Mohamed Abd EL Mawgood

Fix all errors and most warnings reported by checkpatch.pl for
crypto/blkcipher.c.

Signed-off-by: Ahmed Mohamed 
---
 crypto/blkcipher.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/crypto/blkcipher.c b/crypto/blkcipher.c
index 11b9814..4530ff9 100644
--- a/crypto/blkcipher.c
+++ b/crypto/blkcipher.c
@@ -1,6 +1,6 @@
 /*
  * Block chaining cipher operations.
- * 
+ *
  * Generic encrypt/decrypt wrapper for ciphers, handles operations across
  * multiple page boundaries by using temporary blocks.  In user context,
  * the kernel is given a chance to schedule us once per page.
@@ -9,7 +9,7 @@
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the Free
- * Software Foundation; either version 2 of the License, or (at your option) 
+ * Software Foundation; either version 2 of the License, or (at your option)
  * any later version.
  *
  */
@@ -68,6 +68,7 @@ static inline void blkcipher_unmap_dst(struct blkcipher_walk *walk)
 static inline u8 *blkcipher_get_spot(u8 *start, unsigned int len)
 {
u8 *end_page = (u8 *)(((unsigned long)(start + len - 1)) & PAGE_MASK);
+
return max(start, end_page);
 }
 
@@ -334,6 +335,7 @@ static int blkcipher_walk_first(struct blkcipher_desc *desc,
walk->iv = desc->info;
if (unlikely(((unsigned long)walk->iv & walk->alignmask))) {
int err = blkcipher_copy_iv(walk);
+
if (err)
return err;
}
@@ -541,7 +543,7 @@ static void crypto_blkcipher_show(struct seq_file *m, struct crypto_alg *alg)
__attribute__ ((unused));
 static void crypto_blkcipher_show(struct seq_file *m, struct crypto_alg *alg)
 {
-   seq_printf(m, "type : blkcipher\n");
+   seq_puts(m, "type : blkcipher\n");
seq_printf(m, "blocksize: %u\n", alg->cra_blocksize);
seq_printf(m, "min keysize  : %u\n", alg->cra_blkcipher.min_keysize);
seq_printf(m, "max keysize  : %u\n", alg->cra_blkcipher.max_keysize);
@@ -567,7 +569,7 @@ static int crypto_grab_nivcipher(struct crypto_skcipher_spawn *spawn,
int err;
 
type = crypto_skcipher_type(type);
-   mask = crypto_skcipher_mask(mask)| CRYPTO_ALG_GENIV;
+   mask = crypto_skcipher_mask(mask) | CRYPTO_ALG_GENIV;
 
alg = crypto_alg_mod_lookup(name, type, mask);
if (IS_ERR(alg))
-- 
1.9.1




[PATCH] cleanups for crypto/wp512.c

2015-07-24 Thread Ahmed Mohamed Abd EL Mawgood

Fix all errors and warnings reported by checkpatch.pl for
crypto/wp512.c.

Signed-off-by: Ahmed Mohamed 
---
 crypto/wp512.c | 120 -
 1 file changed, 60 insertions(+), 60 deletions(-)

diff --git a/crypto/wp512.c b/crypto/wp512.c
index 7ee5a04..8a26965 100644
--- a/crypto/wp512.c
+++ b/crypto/wp512.c
@@ -779,7 +779,8 @@ static const u64 rc[WHIRLPOOL_ROUNDS] = {
  * The core Whirlpool transform.
  */
 
-static void wp512_process_buffer(struct wp512_ctx *wctx) {
+static void wp512_process_buffer(struct wp512_ctx *wctx)
+{
int i, r;
u64 K[8];/* the round key */
u64 block[8];/* mu(buffer) */
@@ -801,78 +802,78 @@ static void wp512_process_buffer(struct wp512_ctx *wctx) {
 
for (r = 0; r < WHIRLPOOL_ROUNDS; r++) {
 
-   L[0] = C0[(int)(K[0] >> 56)   ] ^
+   L[0] = C0[(int)(K[0] >> 56)]^
   C1[(int)(K[7] >> 48) & 0xff] ^
   C2[(int)(K[6] >> 40) & 0xff] ^
   C3[(int)(K[5] >> 32) & 0xff] ^
   C4[(int)(K[4] >> 24) & 0xff] ^
   C5[(int)(K[3] >> 16) & 0xff] ^
   C6[(int)(K[2] >>  8) & 0xff] ^
-  C7[(int)(K[1]  ) & 0xff] ^
+  C7[(int)(K[1]) & 0xff]   ^
   rc[r];
 
-   L[1] = C0[(int)(K[1] >> 56)   ] ^
+   L[1] = C0[(int)(K[1] >> 56)]^
   C1[(int)(K[0] >> 48) & 0xff] ^
   C2[(int)(K[7] >> 40) & 0xff] ^
   C3[(int)(K[6] >> 32) & 0xff] ^
   C4[(int)(K[5] >> 24) & 0xff] ^
   C5[(int)(K[4] >> 16) & 0xff] ^
   C6[(int)(K[3] >>  8) & 0xff] ^
-  C7[(int)(K[2]  ) & 0xff];
+  C7[(int)(K[2]) & 0xff];
 
-   L[2] = C0[(int)(K[2] >> 56)   ] ^
+   L[2] = C0[(int)(K[2] >> 56)]^
   C1[(int)(K[1] >> 48) & 0xff] ^
   C2[(int)(K[0] >> 40) & 0xff] ^
   C3[(int)(K[7] >> 32) & 0xff] ^
   C4[(int)(K[6] >> 24) & 0xff] ^
   C5[(int)(K[5] >> 16) & 0xff] ^
   C6[(int)(K[4] >>  8) & 0xff] ^
-  C7[(int)(K[3]  ) & 0xff];
+  C7[(int)(K[3]) & 0xff];
 
-   L[3] = C0[(int)(K[3] >> 56)   ] ^
+   L[3] = C0[(int)(K[3] >> 56)]^
   C1[(int)(K[2] >> 48) & 0xff] ^
   C2[(int)(K[1] >> 40) & 0xff] ^
   C3[(int)(K[0] >> 32) & 0xff] ^
   C4[(int)(K[7] >> 24) & 0xff] ^
   C5[(int)(K[6] >> 16) & 0xff] ^
   C6[(int)(K[5] >>  8) & 0xff] ^
-  C7[(int)(K[4]  ) & 0xff];
+  C7[(int)(K[4]) & 0xff];
 
-   L[4] = C0[(int)(K[4] >> 56)   ] ^
+   L[4] = C0[(int)(K[4] >> 56)]^
   C1[(int)(K[3] >> 48) & 0xff] ^
   C2[(int)(K[2] >> 40) & 0xff] ^
   C3[(int)(K[1] >> 32) & 0xff] ^
   C4[(int)(K[0] >> 24) & 0xff] ^
   C5[(int)(K[7] >> 16) & 0xff] ^
   C6[(int)(K[6] >>  8) & 0xff] ^
-  C7[(int)(K[5]  ) & 0xff];
+  C7[(int)(K[5]) & 0xff];
 
-   L[5] = C0[(int)(K[5] >> 56)   ] ^
+   L[5] = C0[(int)(K[5] >> 56)]^
   C1[(int)(K[4] >> 48) & 0xff] ^
   C2[(int)(K[3] >> 40) & 0xff] ^
   C3[(int)(K[2] >> 32) & 0xff] ^
   C4[(int)(K[1] >> 24) & 0xff] ^
   C5[(int)(K[0] >> 16) & 0xff] ^
   C6[(int)(K[7] >>  8) & 0xff] ^
-  C7[(int)(K[6]  ) & 0xff];
+  C7[(int)(K[6]) & 0xff];
 
-   L[6] = C0[(int)(K[6] >> 56)   ] ^
+   L[6] = C0[(int)(K[6] >> 56)]^
   C1[(int)(K[5] >> 48) & 0xff] ^
   C2[(int)(K[4] >> 40) & 0xff] ^
   C3[(int)(K[3] >> 32) & 0xff] ^
   C4[(int)(K[2] >> 24) & 0xff] ^
   C5[(int)(K[1] >> 16) & 0xff] ^
   C6[(int)(K[0] >>  8) & 0xff] ^
-  C7[(int)(K[7]  ) & 0xff];
+  C7[(int)(K[7]) & 0xff];
 
-   L[7] = C0[(int)(K[7] >> 56)   ] ^
+   L[7] = C0[(int)(K[7] >> 56)]^
   C1[(int)(K[6] >
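
The hunk above changes only whitespace; every L[i] is still the XOR of eight table lookups, where table Ct is indexed by byte t of K[(i - t) mod 8], counting bytes from the most significant end. A minimal user-space sketch of that indexing pattern follows. It is not the kernel code: byte_of() and mix_word() are hypothetical names, the eight tables C0..C7 are assumed to be packed into one flat array of 8 * 256 entries, and the extra "^ rc[r]" term on L[0] is left out.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return byte n of a 64-bit word, where n = 0 is the
 * most significant byte. Matches the (K[x] >> 56), (K[x] >> 48) & 0xff, ...
 * expressions in the patch.
 */
static inline unsigned int byte_of(uint64_t w, unsigned int n)
{
	return (unsigned int)(w >> (56 - 8 * n)) & 0xff;
}

/*
 * Hypothetical helper: compute one word L[i] of the round. Table t is
 * assumed to start at tables[t * 256]; it is indexed by byte t of
 * K[(i - t) mod 8]. The round constant is not applied here.
 */
static uint64_t mix_word(const uint64_t *tables, const uint64_t K[8],
			 unsigned int i)
{
	uint64_t acc = 0;
	unsigned int t;

	for (t = 0; t < 8; t++)
		acc ^= tables[t * 256 + byte_of(K[(i + 8 - t) & 7], t)];

	return acc;
}
```

With this layout, mix_word(tables, K, 0) reads K[0], K[7], K[6], ..., K[1] in that order, which is the same word order as the L[0] expression in the diff.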

[PATCH] Security.c: fix 3 coding style indentation errors

2015-07-16 Thread Ahmed Mohamed Abd EL Mawgood
From 34330e77b9dfec43e65de069e83ca41cc145ed07 Mon Sep 17 00:00:00 2001
From: ahmed 
Date: Thu, 16 Jul 2015 17:12:52 +0200
Subject: [PATCH] Security.c: fix 3 coding style indentation errors

This is my first patch, to get my hands dirty.
Fix 3 simple indentation errors within security/security.c.

Signed-off-by: Ahmed Mohamed 
---
 security/security.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/security/security.c b/security/security.c
index 595fffa..99c2d3b 100644
--- a/security/security.c
+++ b/security/security.c
@@ -308,7 +308,7 @@ int security_sb_statfs(struct dentry *dentry)
 }
 
 int security_sb_mount(const char *dev_name, struct path *path,
-   const char *type, unsigned long flags, void *data)
+   const char *type, unsigned long flags, void *data)
 {
return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
 }
@@ -567,8 +567,8 @@ int security_inode_rename(struct inode *old_dir, struct dentry *old_dentry,
   struct inode *new_dir, struct dentry *new_dentry,
   unsigned int flags)
 {
-if (unlikely(IS_PRIVATE(d_backing_inode(old_dentry)) ||
-(d_is_positive(new_dentry) && IS_PRIVATE(d_backing_inode(new_dentry)
+   if (unlikely(IS_PRIVATE(d_backing_inode(old_dentry)) ||
+   (d_is_positive(new_dentry) && IS_PRIVATE(d_backing_inode(new_dentry)
return 0;
 
if (flags & RENAME_EXCHANGE) {
-- 
1.9.1


