GPU Implementation of image processing algorithms

kunal ghosh
Hi,
I am implementing the face recognition algorithm for digiKam and would like to use a GPGPU framework for it, but I
have not been able to decide between OpenCL and CUDA (C for CUDA specifically).

PS: I am deliberately not including any more information about either framework, so as to attract unbiased opinions.

Also, I could write the part that executes on the GPU in Python, which would shorten the development cycle. Good Python bindings exist for both frameworks: PyCUDA and PyOpenCL.

Python functions are easily callable from within C/C++ code, as demonstrated by Link 1 Link2 and Link3.
So, is it fine if the algorithms are implemented in Python and then called from within digiKam?

All suggestions and comments are welcome.

--
regards
-------
Kunal Ghosh
Dept of Computer Sc. & Engineering.
Sir MVIT
Bangalore,India

Quote:
"Ignorance is not a sin, the persistence of ignorance is"
--
"If you find a task difficult today, you'll find it difficult 10yrs later too !"
-----
"Failing to Plan is Planning to Fail"

Blog:kunalghosh.wordpress.com
Website:www.kunalghosh.net46.net
V-card:http://tinyurl.com/86qjyk


_______________________________________________
Digikam-devel mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-devel

Re: GPU Implementation of image processing algorithms

Aditya Bhatt
Hi Kunal,

If you wish to use GPGPU, OpenCL is the right choice over CUDA. CUDA's API is not only proprietary, it is also specific to nVidia hardware. OpenCL is better in this respect. On an nVidia card, it seems that OpenCL will internally use CUDA, so OpenCL should work on users' computers regardless of GPU vendor, provided an OpenCL driver is installed.

However, I'm not sure, but there seems to be a small problem with adoption: http://www.khronos.org/opencl/adopters/
There are some odd conditions regarding publishable usage that I don't fully understand. It seems that you must gain some sort of approval and pass some tests before you are allowed to say that you used OpenCL in digiKam. If you don't want to publish your code, but keep it for personal/closed usage, then you don't have to pay royalties.
Please correct me if I'm wrong, since this seems free as in speech/usage, but not free as in beer: there appear to be some royalty issues if you don't pass the conformance tests.

As a side note, I had a talk with Alex about using EBGM a few days ago, and we decided not to use it for the moment. We don't have anything against the algorithm; it's just that if I were the only one working on it, there wouldn't be enough time to implement all the algorithms and also complete the tagging part within the GSoC period. We definitely want to have eigenfaces and fisherfaces, despite their limitations. Retraining is slow only if there are more than ~400 tagged friends in the database, which is a rarity. The main concern is pose variation: for that, I plan to link multiple poses of the same person with the same ID. As a consequence, the initial accuracy during training will be lower, but after some time it should be good enough.

But still, if you can implement EBGM for libface, it'd be great :) We'd have one more algorithm in the bag. It's just that one person can't finish everything in time. So since you're willing, go ahead and start on EBGM. Eigenfaces is almost complete.

PS: Hopefully someone will figure out whether OpenCL can be used in digiKam or not.

Cheers


--
Aditya Bhatt
Blog : http://adityabhatt.wordpress.com
Face Recognition Library : http://libface.sourceforge.net


Re: GPU Implementation of image processing algorithms

kunal ghosh
Thanks for the reply, Aditya.
Comments inline.

> If you wish to use GPGPU, OpenCL is the right choice over CUDA. CUDA's API is not only proprietary, it also is specifically for nVidia hardware. [...]

That is my opinion too.

> However, I'm not sure, but there seems to be a small problem with adoption : http://www.khronos.org/opencl/adopters/
> [...] Please correct me if I'm wrong, since this seems free as in speech/usage, but not free as in beer - there are some royalty issues if you don't pass the conformance tests.
 
Actually, there is a small error of interpretation: the conditions regarding publishable usage apply only to adopters (i.e. NVIDIA, ATI etc., see http://www.khronos.org/members/conformant/ ) who ship OpenCL-compliant drivers or API implementations. As the main adopters page says, we as implementers should not have any problems. Quoting from http://www.khronos.org/adopters :

"  Implementers - for no cost or license fees you may:
  • Create and deliver a product using the publicly released specifications and technologies
  • But you may *not* claim that it is "compliant" unless they enter and pass conformance testing
  • And if the product is a software or hardware engine, you may not advertise it using the Khronos API technology logos or trademarks    "

The second and third points are fine from an open-source project's perspective: the terms give us both freedoms, free speech and free beer, and only ask us not to publicize it :) (without conformance testing).
 
> As a side note - I had a talk with Alex about using EBGM a few days ago, and we decided not to use it for the moment. We don't have anything against the algorithm, it's just that if only I did it, there won't be enough time to implement all the algorithms and also complete the tagging part within the GSoC period.

Actually, free implementations of EBGM already exist. As I pointed out in a reply to Marcel's mail some time back, they can be found at http://malic.sourceforge.net/ and at http://www.cs.colostate.edu/evalfacerec/algorithms5.html . Also, I would like to talk with you both on IRC; on which channel do you (Alex and you) usually meet?

IMHO there would be sufficient time to create the tagging widget, since it is quite easy to write plugins for
digiKam. The EBGM algorithm should also take no more than about a month, as free implementations already
exist; I would only have to use the OpenCL API to modify the necessary portions of the existing code.

> We definitely want to have eigenface and fisherface, despite the limitations - the retraining is slow only if there are more than ~400 tagged friends in the database, which is a rarity. The main concern is pose variation - for that, I plan to link multiple poses of the same person with the same ID. As a consequence, the initial accuracy while training shall be less, but after some time it would be good enough.
 
As you mentioned, your main concern is pose variation. IMHO (from my work on face recognition over the past year), Fisherfaces is not the right way to go, for the following reasons:

1. Fisherfaces uses the same methodology as Eigenfaces (as even a preliminary survey of the subject shows) and
   does not yield satisfactory results on images with expression and pose variation.

2. The only advantage of Fisherfaces over Eigenfaces is that it makes recognition illumination invariant. But that is not much of a
   problem, as the family and group photos that digiKam will mostly encounter are taken with camera flash or outdoors,
   which results in well-lit photographs.

There is also a problem with pose and the retraining of Eigenfaces and Fisherfaces:

Assume you are training your recognition model with training images of a single person.
Since Eigenfaces and Fisherfaces rely on a nearest-neighbour classifier for recognition, you have to train the model with a larger number of images to get satisfactory recognition results.
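
A minimal sketch of the nearest-neighbour step mentioned above, assuming the faces have already been projected to short feature vectors. The three-component vectors and the labels `alice` and `bob` are made up for illustration; they do not come from libface. It shows why extra training views of a person help: the probe is assigned to whichever stored view happens to be closest.

```python
import math


def nearest_neighbour(probe, gallery):
    """Return (label, distance) of the gallery vector closest to `probe`."""
    best_label, best_dist = None, math.inf
    for label, vector in gallery:
        dist = math.dist(probe, vector)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical projected training faces: (label, feature vector).
gallery = [
    ("alice", [0.9, 0.1, 0.0]),  # frontal view
    ("alice", [0.6, 0.5, 0.1]),  # turned view
    ("bob", [0.1, 0.8, 0.3]),
]

# A probe that resembles the turned view far more than the frontal one.
label, dist = nearest_neighbour([0.7, 0.4, 0.1], gallery)
print(label, round(dist, 3))
```

Without the second, turned view of `alice` in the gallery, the same probe would sit much further from her only stored vector, which is the effect described above.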

Now let us look at how many training images (of a single person) we would need for satisfactory recognition results.

Take a person facing the camera and vary the pose in 1-degree steps from 0 degrees (face towards left) to 180 degrees (face towards right): that gives about 180 images.

Now if the person looks upwards by 1 degree, we again get 180 images from left to right; if we keep varying the pose this way, we get approximately 180 x 180 images of the same person.

Also, with each successive pose-varied image added to the training set, the training time would increase exponentially.
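
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; the function name `pose_count` is made up, and it assumes the up-down sweep covers the same 180-degree range as the left-right sweep.

```python
def pose_count(span_degrees, step_degrees):
    """Number of distinct poses when sweeping one axis in fixed steps."""
    return span_degrees // step_degrees


yaw_poses = pose_count(180, 1)    # left-to-right sweep in 1-degree steps
pitch_poses = pose_count(180, 1)  # up-down sweep, assumed to span the same range
total_images = yaw_poses * pitch_poses
print(total_images)  # 32400
```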

Now, there are two problems with this:

1. There will not be enough training images to get satisfactory results (so many pose variations of a single person
   are difficult to obtain), so building up an adequate training set would take a long time.

2. Eigenfaces and Fisherfaces (in general, principal-component-based models) calculate a single set of
   eigenfaces/fisherfaces from the whole training set, so for a well-trained recognizer the training time would be enormous.
 

> But still, if you can implement EBGM for libface, it'd be great :) We'd have one more algorithm in the bag. [...] So since you're willing, start EBGM then. Eigenfaces is almost complete.

I would love to add the implementation to libface. But given the similarity between Eigenfaces and Fisherfaces, IMHO the
effort should go into getting as many different algorithms implemented as possible.

> PS: Hopefully someone will figure out if OpenCL can be used in digiKam or not.

More opinions and suggestions from the digiKam community are welcome.
 




Re: GPU Implementation of image processing algorithms

Aditya Bhatt
Hi Kunal,
 
> Actually, there is a small interpretation error, the conditions regarding publishable usage is only for the adopters (ie. NVIDIA or ATI etc http://www.khronos.org/members/conformant/ ) who come up with OpenCL compliant drivers / API implementation. [...]
> Regarding the second and the third point , that's ok from an OpenSource project perspective . As it provides us, both the freedoms
> of free speech & free beer , but asks us not to publicize it :) (without conformance testing).

OK then. But what I'm not sure about is this: if we incorporate OpenCL into libface and consequently digiKam, doesn't that amount to publicizing it? If not, that's great with me. But please ask the others for their opinions too; we want as few dependencies as possible.
 
> Actually, freely available implementations of EBGM are available [...] at http://malic.sourceforge.net/ and also at http://www.cs.colostate.edu/evalfacerec/algorithms5.html .

Yes, the CSU project was also my first choice for EBGM.
 
> Now let us look at how many training images we would need (of a single person) for satisfactory recognition results.
> [...] if we keep varying poses we would get approximately 180 x 180 images of the same person.
>
> Also for each successive pose varied image added to the training set the training time would increase exponentially.


Actually, no: we can assume a tolerance of about 10-20 degrees. I think ~4 to 5 representative images per person should be okay for a photo management suite. What is required for each person is a frontal face, a profile face, and a sideways face. More faces can then be added on the fly as more variations are encountered. This way, we get a denser sampling over pose as time goes on.
And as I already said, the average number of friends/acquaintances that people tag in photos is not large enough to noticeably slow down retraining.
And compared to eigenfaces, the PCA + LDA approach is, to some extent, able to accommodate pose variations (20-30 degrees), so fisherfaces should be fine for digiKam if the above method is followed.
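
As a rough check of the numbers above, the following sketch counts how many stored views are needed to cover a 180-degree left-to-right sweep for a given tolerance. It rests on an assumption that is not stated in the thread: the tolerance applies on both sides of a stored view, so one view covers twice the tolerance. The function name `views_needed` is made up for illustration.

```python
import math


def views_needed(span_degrees, tolerance_degrees):
    """Views needed so that every pose in the span lies within the tolerance of one view.

    Assumption: a stored view is usable up to `tolerance_degrees` on either
    side, i.e. it covers 2 * tolerance_degrees of the sweep.
    """
    return math.ceil(span_degrees / (2 * tolerance_degrees))


for tolerance in (10, 15, 20):
    print(tolerance, views_needed(180, tolerance))
```

Under that assumption a 20-degree tolerance needs 5 views and a 10-degree tolerance needs 9, which is roughly in line with the handful of representative images suggested above and far below the 1-degree sampling.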

> I would love to add the implementation to libface. But looking at the similarity between Eigenfaces and Fisherfaces IMHO the
> effort should be to get as many different algorithms implemented as possible.

Very true. But for now, the priority is to get fisherfaces up and running. It's a kind of insurance: fisherfaces should be quite easy to implement and, although less accurate than EBGM, is good enough to be incorporated into digiKam. So we'll definitely have at least one working algorithm. Meanwhile, we might want EBGM at some point, so jump in :)

Alex and I (and the rest of the digiKam team, for that matter) don't communicate much via IRC; we use the ML more.
Get yourself subscribed to the libface ML. As for me, I usually idle in #digikam, #kde, and #kde-in. I think I've talked to you before on #kde-in :)

Cheers,

Re: GPU Implementation of image processing algorithms

Aditya Bhatt
PS: Google for H-Eigenfaces (aka Hybrid-Eigenfaces). It is a nifty tweak to the original method. A professor at my university just showed me a paper that proposes it, and it goes a long way towards solving the problem of pose variation.


Re: GPU Implementation of image processing algorithms

Gilles Caulier-4
In reply to this post by kunal ghosh
From my point of view, it's always a bad idea to force an implementation to
depend on an API tied to specific hardware. Your code must also
compile and run without such a dependency. Of course, optionally
providing a way to speed up computation is always welcome.

For example, in libraw we use OpenMP to parallelize computation during RAW
demosaicing. OpenMP is supported by GCC and by other
compilers such as MS Visual C++.

The other consideration is platform compatibility. Always use libraries that are
available on all platforms: Linux, Mac and Windows.
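
The "optional speed-up, no hard dependency" approach described above could look roughly like this on the Python side. It is a sketch under assumptions: the helper names (`opencl_available`, `mean_face_cpu`, `mean_face`) and the mean-face workload are hypothetical, and no GPU kernel is shown. The only point is that the portable path always works and the accelerated path is used only when the PyOpenCL binding can be found.

```python
import importlib.util


def opencl_available():
    """Report whether the optional PyOpenCL binding is installed."""
    return importlib.util.find_spec("pyopencl") is not None


def mean_face_cpu(faces):
    """Portable reference path: per-pixel mean of equally sized face vectors."""
    count = len(faces)
    return [sum(column) / count for column in zip(*faces)]


def mean_face(faces, gpu_impl=None):
    """Use `gpu_impl` only when one is supplied and OpenCL is importable."""
    if gpu_impl is not None and opencl_available():
        return gpu_impl(faces)
    return mean_face_cpu(faces)


faces = [[0.0, 2.0, 4.0], [2.0, 4.0, 8.0]]
print(mean_face(faces))  # [1.0, 3.0, 6.0]
```

In compiled code the same idea is usually expressed as a build-time option, much like the OpenMP case mentioned above.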

My 10cts€

Gilles Caulier


Re: GPU Implementation of image processing algorithms

kunal ghosh
In reply to this post by Aditya Bhatt
Hi Aditya,

> PS: Google for H-Eigenfaces (aka Hybrid-Eigenfaces). It is a nifty tweak to the original method. [...]

Would it be possible for you to forward the relevant paper to my personal mail? Right now I don't have access to my college's
Springer/Elsevier/IEEE accounts.


Re: GPU Implementation of image processing algorithms

kunal ghosh
In reply to this post by Aditya Bhatt

Hi Aditya,

On Sun, Apr 18, 2010 at 10:26 AM, Aditya Bhatt <[hidden email]> wrote:
> PS: Google for H-Eigenfaces (aka Hybrid-Eigenfaces). It is a nifty tweak to the original method. A professor in my university just showed me a paper that proposes this method, and it greatly solves the problem of pose variation.

As we discussed on #digikam, the paper "Pose invariant virtual classifiers from single training image using novel
hybrid-eigenfaces" ( http://linkinghub.elsevier.com/retrieve/pii/S0925231210001475 )
says the following on page 6, right column, second paragraph:

[Quoting]

"It should be noted that the synthesized virtual views would strictly be under the same pose as H-eigenfaces so only those pose variations can be obtained in training images which are present in H-eigenfaces. It is required to have a face dataset consisting of different subject’s face images under different viewpoints to obtain H-eigenfaces under those viewpoints. Consequently, the method relies on the availability of a generic face dataset containing face images under different pose. In this article FERET face Database [38] serves the purpose of generic dataset."

[/Quoting]

(Regarding the reliance on a generic face dataset:) That many pose-varied images of a person are readily available in a face database such as FERET, but are difficult
to get from a personal photo album. (Recognition results will improve as the system is used over time, but users may not keep
using the system for that long!)
 

Re: GPU Implementation of image processing algorithms

Gerhard Kulzer
In reply to this post by kunal ghosh
Without knowing much about the practical status of OpenCL as of today, it seems the natural choice to me:
  • open standard
  • supports major HW (nVidia and AMD)
  • cross platform (Linux, Mac, Windows...)
Gerhard


Re: GPU Implementation of image processing algorithms

Aditya Bhatt
In reply to this post by kunal ghosh
Hi Kunal,

There was a slight error in your interpretation of their method:

> ( from the italicized text ) So many pose varied images of a person are readily available in a face database as FERET but difficult
> to get in a Personal Photo album. ( Usage of the system overtime will increase recognition results but users may not continue
> using the system for that long ! )

They actually use the FERET database to train the coefficients for reconstructing a frontal face from a profile/side face. Later, those same coefficients can be used to map a non-FERET profile face to its virtual frontal equivalent. Therefore I, as a developer, can generate these coefficients using FERET's huge database, and then bundle a file containing the values with libface for end-users to use :)
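
The train-once, ship-the-coefficients idea can be sketched as follows. This is only an illustration under assumptions: the file name `pose_coefficients.json`, the JSON layout and the 2x2 toy matrix are hypothetical, and the training step that would derive the matrix from FERET is not shown. It also assumes the learned mapping is linear, i.e. the virtual frontal vector is a matrix applied to the profile vector.

```python
import json
import os
import tempfile


def save_coefficients(path, matrix):
    """Developer side: write the trained matrix to a file that ships with the library."""
    with open(path, "w") as handle:
        json.dump({"rows": matrix}, handle)


def load_coefficients(path):
    """End-user side: read the bundled matrix back."""
    with open(path) as handle:
        return json.load(handle)["rows"]


def to_virtual_frontal(profile, matrix):
    """Apply the linear map: each output value is a weighted sum of the profile values."""
    return [sum(a * x for a, x in zip(row, profile)) for row in matrix]


# Toy stand-in for coefficients that would be trained on a generic dataset.
toy_matrix = [[1.0, 0.5], [0.0, 2.0]]
path = os.path.join(tempfile.mkdtemp(), "pose_coefficients.json")
save_coefficients(path, toy_matrix)

# At runtime only the file is needed, not the training database.
frontal = to_virtual_frontal([2.0, 1.0], load_coefficients(path))
print(frontal)  # [2.5, 2.0]
```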

There is a very nice paper (admittedly better framed than the one I showed you) that explains how GLR (Global Linear Regression) can be applied to predict the frontal face from the profile view: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.93.8750&rep=rep1&type=pdf

In fact, the authors of this paper go one step further and explain LLR, or Local Linear Regression, which applies the above GLR algorithm to localized "patches" of a face to achieve much finer accuracy in rotation.

So as I see it, this method is well suited to the problem at hand.


Cheers,

Re: GPU Implementation of image processing algorithms

Aditya Bhatt
PS: Combining the above algorithm with LDA should give even better results, solving both pose and illumination variation.
